Personal Information Protection systems can highlight things that are likely to be people's names or credit card numbers.
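As a rough illustration of the credit card case, here is a minimal sketch in Python. The regex, the `[[ ]]` markers, and the function names are all made up for this example; real PII tools are considerably more careful. It only marks digit runs that pass the Luhn checksum, which cuts down on false positives from order numbers and the like.

```python
import re

# Hypothetical pattern: 13 to 19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_ok(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:      # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def highlight_cards(text: str, open_mark: str = "[[", close_mark: str = "]]") -> str:
    """Wrap likely card numbers in markers; leave everything else untouched."""
    def repl(match: re.Match) -> str:
        digits = re.sub(r"\D", "", match.group())
        if luhn_ok(digits):
            return f"{open_mark}{match.group()}{close_mark}"
        return match.group()
    return CARD_RE.sub(repl, text)

print(highlight_cards("Paid with 4111 1111 1111 1111 yesterday."))
```

Names are a much harder target than card numbers because there is no checksum to lean on, which is where the trained models discussed further down come in.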
You could color statements that have been approved by the board (w/ metadata) or parts where it seemed the writer was angry, or something lewd/indecent, or written in Dutch, etc.
You can't put "parts of speech" tags on all words and show them to people because the whole Chomsky/Pinker approach to linguistics is a failure in terms of engineering computer systems that understand text.
A typical English sentence has thousands of valid parses according to that framework; the "most likely parse" could be wrong more than half the time. "Squad Helps Dog Bite Victim" is a good joke, but it's not funny when it happens to you: putting the wrong coloring on a sentence like that makes you look like a dork and makes the reader lose empathy with you.
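To get a feel for how fast the number of candidate structures grows, here is a small sketch that counts the ways to bracket a sentence into a binary tree (the Catalan numbers). This is only the count of unlabeled tree shapes, not the number of parses any particular grammar would accept, so treat it as an illustration of the growth rather than a measurement; the function name is made up for the example.

```python
from math import comb

def binary_bracketings(n_words: int) -> int:
    """Number of binary tree shapes over n_words leaves: Catalan(n_words - 1)."""
    n = n_words - 1
    return comb(2 * n, n) // (n + 1)

for n_words in (5, 10, 20):
    print(n_words, binary_bracketings(n_words))
```

Even the five-word headline has 14 possible bracketings before any part-of-speech labels are assigned, and a ten-word sentence has 4862.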
Since the 1970s there have been "magic magic marker" methods such as hidden Markov models and, later, conditional random fields. The modern neural methods are even better than the old ones.
If you have a realistic goal and the faith and determination to mark up 20,000 or so sentences, you can train a model of that sort to mark up text the way you do, well enough that people might accept it. Having the training set is more essential than having the latest algorithm.
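To make the "magic magic marker" idea concrete, here is a minimal sketch of a hidden Markov model tagger in plain Python: it counts tag transitions and word emissions from hand-labeled sentences, then uses Viterbi decoding to mark up new text. The three-sentence training set, the `NAME`/`O` tag names, and the add-one smoothing are assumptions for the example; a usable model would need the thousands of labeled sentences mentioned above.

```python
import math
from collections import Counter, defaultdict

def train_hmm(tagged_sentences):
    """Count tag transitions and word emissions from hand-labeled sentences."""
    trans = defaultdict(Counter)   # previous tag -> counts of next tag
    emit = defaultdict(Counter)    # tag -> counts of words
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"               # start-of-sentence marker
        for word, tag in sentence:
            w = word.lower()
            trans[prev][tag] += 1
            emit[tag][w] += 1
            vocab.add(w)
            prev = tag
    return trans, emit, sorted(emit), vocab

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability, so unseen events are not impossible."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, model):
    """Return the most likely tag sequence for words under the trained model."""
    trans, emit, tags, vocab = model
    if not words:
        return []
    n_tags, n_words = len(tags), len(vocab) + 1   # +1 slot for unknown words
    words = [w.lower() for w in words]
    scores = {t: log_prob(trans["<s>"], t, n_tags)
                 + log_prob(emit[t], words[0], n_words) for t in tags}
    backptrs = []
    for w in words[1:]:
        new_scores, ptr = {}, {}
        for t in tags:
            # Best previous tag for reaching tag t at this position.
            prev = max(tags, key=lambda p: scores[p] + log_prob(trans[p], t, n_tags))
            new_scores[t] = (scores[prev] + log_prob(trans[prev], t, n_tags)
                             + log_prob(emit[t], w, n_words))
            ptr[t] = prev
        backptrs.append(ptr)
        scores = new_scores
    tag = max(tags, key=scores.get)
    path = [tag]
    for ptr in reversed(backptrs):   # walk the back-pointers to recover the path
        tag = ptr[tag]
        path.append(tag)
    return path[::-1]

# Toy hand-labeled data (hypothetical): NAME marks a person's name, O is everything else.
TRAIN = [
    [("alice", "NAME"), ("met", "O"), ("bob", "NAME")],
    [("bob", "NAME"), ("called", "O"), ("alice", "NAME")],
    [("the", "O"), ("board", "O"), ("met", "O"), ("alice", "NAME")],
]
model = train_hmm(TRAIN)
print(viterbi(["Bob", "met", "Alice"], model))
```

The code is about forty lines and would not change much with more data; the labeled sentences are the expensive part, which is the point about the training set mattering more than the algorithm.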
Yes, I double-checked everything but missed the typo anyways :)
Thank you for the link, I will definitely take a look.
I think what you say makes sense, but I would also note that what did not work out well in the past may work very well tomorrow. Context changes, people and customs change, and that may make a big difference.
We do semantic highlighting; it is called "markup", such as italic, bold or underlined text. This idea is much older than syntax highlighting and I would guess it's where computer language highlighting originates from.
Sarcasm would be nice, but I don't think that's possible.
One thing I'd like to see is the 'pillar' of the text highlighted. People often write a paragraph, page, or chapter around one proposition or argument. Whatever search engines do, it's working, but it's not used here.
Kindle does it well by informing you of commonly highlighted text.
What problem are you trying to solve? This question sounds like a solution looking for a problem... but if there is some specific problem with reading that you are trying to correct, could you let us know what that problem is?