Tagging Automation on TJC
asked 2015-01-16 22:11:54 +0200
This post is a wiki. Anyone with karma >75 is welcome to improve it.
Automated tag generation based on words and clauses in questions, comments and answers could encourage content clean-up on together.jolla.com.
All terms would be managed. High-frequency words would be filtered out using regular expressions. Low-frequency content such as typographical errors would become more obvious and get revised. Troubling activity or content might stand out more, so that it could then be addressed effectively.
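As a rough illustration of the filtering idea, here is a minimal Python sketch. The stop-word pattern, the `rare_terms` helper and the sample texts are all made up for this example; a real list of common words would presumably be maintained by moderators.

```python
import re
from collections import Counter

# Hypothetical stop-word pattern (an assumption, not a list used by the site).
STOP = re.compile(r"^(?:the|a|an|and|or|to|of|is|in|on|it)$")

def rare_terms(texts, max_count=1):
    """Return words seen at most max_count times, ignoring stop words."""
    counts = Counter(
        word
        for text in texts
        for word in re.findall(r"[a-z']+", text.lower())
        if not STOP.match(word)
    )
    return sorted(word for word, n in counts.items() if n <= max_count)

# Made-up sample texts; "batery" is a deliberate typo.
texts = [
    "The battery drains quickly",
    "Battery drains in standby",
    "The batery is draining",
]
print(rare_terms(texts))
```

Rare words are only candidates for review: the list will contain legitimate but uncommon words next to the typos, so a person still decides what gets revised.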
Non-native English users could get notified of potential improvements to the fluency of their writing. It's nicer to get feedback from a machine signalling "hey, perhaps this would be an improvement" than to know that your actions distracted someone from other more valuable parts of their job.
The way I see it, the vital problem-solution dynamic that together.jolla.com encourages would be strengthened by such a tool set.
Its essence would be a moderated index of words and concepts with metainformation, statistics, messages and links, accessible to varying degrees by members, more so by moderators, with advisories and reports available internally and externally, privately and publicly according to needs and policies.
This could make the site feel more like a reference book and a compelling story than a throng of noisy competing interests. At least, I believe this is a way of tackling issues that lends itself strongly to user empowerment and encouragement, yet seems less burdensome for both the paid and the otherwise benevolent participants in this environment.
There may be 77,000 questions and texts to power through. A vitality statistic to show beside each listed word or term would be its age range, based on the dates of the first and last accepted posts in which the text was present.
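The age range could be computed along these lines. This is only a sketch: the `posts` list, its timestamp format and the `term_age_ranges` helper are assumptions for illustration, and the list is assumed to have been reduced to accepted posts already.

```python
import re
from datetime import datetime

# Hypothetical (timestamp, text) pairs, assumed to be accepted posts only.
posts = [
    ("2014-01-05 10:00:00", "Sailfish update broke sync"),
    ("2014-06-20 10:00:00", "Sync works after update"),
    ("2015-01-16 22:11:54", "Tagging automation"),
]

def term_age_ranges(posts):
    """Map each word to the (first, last) datetimes of the posts containing it."""
    ranges = {}
    for stamp, text in posts:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        # set() so a word repeated within one post is handled once
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            first, last = ranges.get(word, (when, when))
            ranges[word] = (min(first, when), max(last, when))
    return ranges

ranges = term_age_ranges(posts)
first, last = ranges["sync"]
print("sync:", first.date(), "to", last.date(), "=", (last - first).days, "days")
```

A word that occurs in only one post gets an age range of zero days, which may itself be a useful signal.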
This is now editable by everyone so please add regular expressions below, not in the comments, because comments do not stay editable long.
Regular expressions for analysing the text of together.jolla.com articles written by its community:
- s/tba/ (total word count)
- s/tba/ (average word count per entry)
- s/tba/ (word index and word instance index and count)
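Until the expressions above are filled in, the three statistics can be sketched in Python. This is not a replacement for the awk one-liners asked for here; the `WORD` pattern, the `word_stats` helper and the sample entries are assumptions for illustration only.

```python
import re
from collections import Counter, defaultdict

# Made-up sample entries; real input would be the site's posts.
entries = [
    "Tagging could encourage clean-up.",
    "Tagging of questions and answers.",
]

# Assumed definition of a word: letters, digits and apostrophes,
# with internal hyphens kept so "clean-up" counts as one word.
WORD = re.compile(r"[A-Za-z0-9']+(?:-[A-Za-z0-9']+)*")

def tokenize(text):
    return [w.lower() for w in WORD.findall(text)]

def word_stats(entries):
    """Return total word count, average per entry, per-word counts and a word index."""
    tokens_per_entry = [tokenize(e) for e in entries]
    total = sum(len(tokens) for tokens in tokens_per_entry)
    average = total / len(entries) if entries else 0.0
    counts = Counter()
    index = defaultdict(list)  # word -> [(entry number, position in entry), ...]
    for entry_no, tokens in enumerate(tokens_per_entry):
        for position, word in enumerate(tokens):
            counts[word] += 1
            index[word].append((entry_no, position))
    return total, average, counts, dict(index)

total, average, counts, index = word_stats(entries)
print(total, average)
print(counts["tagging"], index["tagging"])
```

The index records every instance of a word as an (entry, position) pair, so the instance count for a word is simply the length of its list.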
You may find these locations useful to learn more about awk:
- http://awk.info/
- http://www.catonmat.net/download/awk.cheat.sheet.txt
- http://lawker.googlecode.com/svn/fridge/share/pdf/countingWithAwk.pdf
rdmo ( 2015-01-17 01:28:23 +0200 )