About Us
DataWorks is a unique workplace training program and data services provider. It provides businesses, non-profits, and civic organizations with data cleaning, analysis, and annotation services and trains young people from communities historically minoritized in computing.
DataWorks operates with seven employees: five Data Fellows, one Data Trainer, and a Project Manager. The team works full-time as Georgia Tech employees but operates as a small company inside the institution, providing data services to external clients and internal projects.
DataWorks is a data service provider that focuses on cleaning and preparing data for analysis, annotating data, and helping non-profit and civic organizations collect and analyze data with minoritized populations in Atlanta.
DataWorks has five primary goals:
1. To provide data services to non-profit and civic organizations, researchers and corporate clients.
2. To provide computing education in a work environment that meets the needs of adults with limited technical backgrounds but with great potential to help broaden the demographic of people working in computing.
3. To conduct research to understand better how to create new pathways into computing to attract more diverse workers to data sciences.
4. To research and develop approaches to community engagement and redistribute the wealth of the university beyond itself and its standard beneficiaries.
5. To create a model for co-constructing a work environment where the labor of data work is respected, and the workers have agency in their jobs, resisting exploitive work practices.
The History of DataWorks
2019 – 2020
In September 2019, we received an NSF Smart and Connected Communities award (#1951818 DataWorks: Building Smart Community Capacity). This funding, coupled with funding from the GT Constellations Center for Equity in Computing, allowed us to launch DataWorks in January 2020 as an experimental project. We hired four young people with high school diplomas and few computing skills who identified with groups underrepresented in computing. They started working part-time as Data Wranglers. We were still determining what kind of work we could expect from clients and what training would be needed to bring the employees up to speed on that work. But with Dr. Amanda Meng, a research scientist, leading the way, we wanted to determine whether we could train these young people in the mid-level skills needed to clean and prepare data for data scientists to use. The funding from the Constellations Center allowed us to pay the workers, and the funding from NSF went towards research on the project.
2021
The first year was challenging in many ways, including pandemic stay-at-home orders. But we saw many bright spots, including several paying clients excited to work with us and increasing skill and professionalism among the workers. DataWorks was established as a cost center at Georgia Tech and launched as a full-time workforce in January 2021. At that time, we expanded to five people employed as Data Wranglers and a new research scientist acting as a day-to-day manager.
2022
During 2021, DataWorks brought in more clients and established a project management process, and the workers developed their technical skills in Excel and other data management tools. In 2022, we moved one of the Data Wranglers into the project management role. We started expanding our services to include data collection and annotation, and we began implementing a training program that covered advanced Excel, Python scripting and web scraping, critical data literacy, qualitative data collection and annotation methods, and career networking and job search skills. DataWorks’ client list has also expanded, with client work covering about half of our operating costs.
Present
Over the past three years, we have moved former Data Wranglers into management roles, helped launch several Data Wranglers into new careers, and restructured the program around a one-year fellowship with a greater emphasis on structured training. Data Wranglers do a wide range of tasks that are more challenging than data entry but do not require a computer science degree. Most projects involve data cleaning with a blend of automation techniques and more cognitively demanding manual cleaning. Other projects are complex human-in-the-loop data annotation tasks for machine learning. Clients are non-profits, civic organizations, or academic researchers that are similarly mission-driven.