The International Development Jargon Detector counts jargon words in reports, presentations and other documents. Upload a file below to start catalyzing sustainable change:
The International Development Jargon Detector (IDJD) is a bottom-up, data-driven approach to empowering underserved beneficiaries in the field. It was also a way to play with text extraction and the Natural Language Toolkit in Python.
The IDJD uses a pre-defined list of "jargon" words. It extracts text from most common file formats and counts how many times the uploaded text contains words from the list. Counting is done on word stems, so "sustain", "sustaining" and "sustainability" all count as the same word. Stop words such as "of", "it" and "the" are ignored.
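As a rough illustration of the counting step described above, here is a minimal sketch. It is not the IDJD's actual code: the word lists are hypothetical, and `crude_stem` is a deliberately simple suffix stripper standing in for a real stemmer such as the ones in the Natural Language Toolkit.

```python
import re
from collections import Counter

# Hypothetical lists for illustration only; the IDJD's real lists are not shown here.
JARGON_STEMS = {"sustain", "stakehold", "capac"}
STOP_WORDS = {"of", "it", "the", "a", "and", "to", "in"}

# Crude suffix stripper, assumed here as a stand-in for a proper stemmer.
SUFFIXES = ("ability", "ers", "er", "ing", "ity", "ed", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least four letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word


def count_jargon(text):
    """Count jargon stems in text, ignoring stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    stems = (crude_stem(w) for w in words if w not in STOP_WORDS)
    return Counter(s for s in stems if s in JARGON_STEMS)


counts = count_jargon(
    "Sustaining the sustainability of stakeholder capacity will sustain it."
)
print(counts)
```

In this sketch the three "sustain" variants collapse into a single stem, which is the behaviour the paragraph above describes. A real stemmer handles far more word forms than this suffix list does.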
Most of them! .txt and .doc(x) will work, as will .pdf and .ppt, and even .csv and .xls(x).
The main limitation is that words are compared one at a time, out of context. This misses phrases like "results oriented" or "in the field". It also means the IDJD can't make qualitative distinctions – "capacity" is considered jargon whether we "built stakeholder capacity" or "installed a water tank with a 10,000L capacity".
These deficiencies will be addressed in a future version, pending donor funding.
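The word-at-a-time limitation can be seen in a small sketch. The word lists and function names below are hypothetical, and `contains_phrase` is only one possible way to look at words in sequence, not a description of any planned IDJD feature.

```python
import re

# Hypothetical lists for illustration only.
JARGON_WORDS = {"capacity", "stakeholder"}
STOP_WORDS = {"in", "the", "a", "with", "we"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def jargon_hits(text):
    """Word-at-a-time matching: each word is checked with no context."""
    return [w for w in tokenize(text) if w not in STOP_WORDS and w in JARGON_WORDS]


def contains_phrase(text, phrase):
    """Check whether the words of phrase appear consecutively in text."""
    words = tokenize(text)
    target = tokenize(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Both sentences are flagged for "capacity", although only the first is jargon.
print(jargon_hits("We built stakeholder capacity"))
print(jargon_hits("We installed a water tank with a 10,000L capacity"))

# Single-word matching finds nothing here, while a phrase check does.
print(jargon_hits("Our staff work in the field"))
print(contains_phrase("Our staff work in the field", "in the field"))
```

Phrase matching of this kind would catch "in the field", but it still would not tell the two uses of "capacity" apart; that needs some understanding of context.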
Great question. Here are some articles and blog posts:
Yes, of course. It's very interesting.