The History of DataWorks
2019
In September 2019, we received an NSF Smart and Connected Communities award (#1951818 DataWorks: Building Smart Community Capacity). This funding, coupled with funding from the GT Constellations Center for Equity in Computing, allowed us to launch DataWorks in January 2020 as an experimental project. The Constellations Center funding allowed us to pay workers, and the NSF funding went toward research on the project.
2020
We hired four young people with high school diplomas and few computing skills who identified with groups underrepresented in computing. They started working part-time as Data Wranglers. We were still determining what kind of work we could expect from clients and what training would be needed to bring the employees up to speed on that work. But with Dr. Amanda Meng, a research scientist, leading the way, we wanted to determine whether we could train these young people in the mid-level skills needed to clean and prepare data for data scientists to use.
2021
The first year was challenging in many ways, including pandemic stay-at-home orders. But we saw many bright spots, including several paying clients excited to work with us and increasing skill and professionalism among the workers. DataWorks was established as a cost center at Georgia Tech and launched as a full-time workforce in January 2021. At that time, we expanded to five people employed as Data Wranglers and a new research scientist acting as day-to-day manager. DataWorks brought in more clients and established a project management process, and the workers developed their technical skills in Excel and other data management tools.
2022
In 2022, we moved one of the Data Wranglers into the project management role. We began expanding our services to include data collection and annotation, and we implemented a training program covering technical skills (advanced Excel, Python scripting, and web scraping), critical data literacy, qualitative data collection and annotation methods, and career networking and job search skills. DataWorks’ client list also expanded, with client work covering about half of our operating costs.
Present
Over the past three years, we have moved former Data Wranglers into management roles, helped launch several Data Wranglers into new careers, and restructured the program around a one-year fellowship with greater emphasis on structured training. Data Wranglers do a wide range of tasks that are more challenging than data entry but do not require a computer science degree. Most projects involve data cleaning that blends automation techniques with more cognitively demanding manual cleaning. Other projects are complex human-in-the-loop data annotation tasks for machine learning. Our clients are non-profits, civic organizations, and academic researchers who are similarly mission-driven.