Tuesday, September 29 • 14:00 - 15:15
Lessons learned extracting data from documents

Some of the most interesting datasets started life 'unstructured' -- as documents, emails, web pages, images, videos, and other formats that look nothing like a spreadsheet. This talk will cover the challenges of extracting data from these formats, the tools available, and approaches for verifying the results. No prior technical knowledge required.

Adriana Homolová

Proud coordinator of the data skills training 📊, ARENA, Follow The Money, Lost in Europe
Adriana is a freelance data journalist, trainer and public spending nerd. At Dataharvest, she coordinates the data skills training. At other times, she writes scrapers and investigates the European Union for Follow The Money's Bureau Brussel and collects data on missing children in migration...

Max Harlow

Financial Times
Max Harlow works on the visual and data journalism team at the Financial Times in London, where he focuses on using data to find and tell investigative stories. @maxharlow

Tuesday September 29, 2020 14:00 - 15:15 CEST