Computers and I, or computers and me?
I’m always in search of that grand life direction that will synthesize all my interests, values, and the subjects I’ve studied. Last year I discovered a field that bridges my fondness for linguistics and computers: Computational Linguistics. Its applications range from automated over-the-phone tech support (bleh) and artificial intelligence stuff (meh) to smarter search engines, i.e. ways of rummaging through our modern swamp of information (hmmm). I think there is much yet to be accomplished in the design of intuitive learning tools. My goal is to get people away from the traditional computer monitor and keyboard. I’d love to give someone a gadget that they could take into the woods and ask out loud about the trees or soil types. With that dream in mind, I have been working on some infant programming projects that dabble in the basics of computational linguistics, or “natural language processing”.
When we humans first learn to pick apart English, we find it helpful to identify what’s a noun, what’s a verb, and maybe whether a phrase is in the past or present tense. Sometimes English teachers call this part of “mapping a sentence.” For computers that task isn’t so straightforward. A lot of research has been done on the patterns, statistics, and theoretical structure of natural language, just so a computer can determine things like whether the word “record” refers to a vinyl disc (one of its noun forms) or “the action of capturing sound” (verb form) when used in a sentence. This process is known as “part-of-speech tagging”, and it is one of the initial steps a computer program might take towards “understanding” a phrase. While researching the current work in Natural Language Processing, I stumbled upon the part-of-speech tagger that Dr. Yoshimasa Tsuruoka designed at the University of Tokyo’s Computer Science department. In all my compu-linguistical giddiness, I got inspired to make a graphical interface for his Unix command-line program so that non-nerds could use it and get excited too. A little PHP, JavaScript, and CSS later, I present The Pretty POS Tagger!
- My graphical part-of-speech tagger
- More info on the original tagger
- Wikipedia article on part-of-speech tagging
- Links to other language processing software and resources
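For the curious, here is a toy sketch in Python of what “tagging” the word “record” can look like. To be clear, this is not how Dr. Tsuruoka’s tagger works; real taggers generally learn their decisions from large collections of already-tagged text. The tiny lexicon, the tag names, and the single context rule below are all made up for illustration.

```python
# Toy part-of-speech tagger: a hand-made lexicon plus one context rule.
# LEXICON, the tag names, and VERB_CONTEXTS are illustrative assumptions,
# not taken from any real tagger.

LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "i": ["PRON"],
    "will": ["MODAL"],
    "to": ["TO"],
    "old": ["ADJ"],
    "song": ["NOUN"],
    "played": ["VERB"],
    "record": ["NOUN", "VERB"],  # ambiguous: vinyl disc or capturing sound
}

# Tags that, in this toy model, suggest the next ambiguous word is a verb.
VERB_CONTEXTS = {"PRON", "MODAL", "TO"}


def tag(sentence):
    """Return a list of (word, tag) pairs for a whitespace-separated sentence."""
    tagged = []
    previous_tag = None
    for word in sentence.split():
        # Unknown words default to NOUN, a common fallback guess.
        candidates = LEXICON.get(word.lower(), ["NOUN"])
        if len(candidates) == 1:
            choice = candidates[0]
        elif "VERB" in candidates and previous_tag in VERB_CONTEXTS:
            choice = "VERB"
        else:
            choice = "NOUN"
        tagged.append((word, choice))
        previous_tag = choice
    return tagged


print(tag("I played the old record"))
print(tag("I will record a song"))
```

In the first sentence “record” follows an adjective and comes out as a noun; in the second it follows “will” and comes out as a verb. One hand-written rule like this breaks down quickly on real text, which is why so much research goes into the statistics.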
Ashby wrote:
This is such an incredible entry! I learned so much. Are you still into Computational Linguistics? How are intuitive learning and computers tied to your interest in getting people outside? What other projects have you been working on that we could see?
Posted on 16-Feb-07 at 10:07 am | Permalink
Wynde Dyer wrote:
DUDE! Wow. That was the best thing ever to wake up to this morning! You’re incredible!
Look what I did with one of the text messages from the corpus:
Giant steps are what we take. Walking on the moon. I hope my legs do n’t break. Walking on the moon.
So yeah, I’ve got to get a grant to somehow pay you to help me with my thesis, methinks.
You rock!
Posted on 12-Nov-07 at 11:16 am | Permalink
Wynde Dyer wrote:
Hmm . . . actually . . . I inserted the POS tags into that message and then they went away in the post, which is strange? But whatever, your tagger tagged everything perfectly and then the internet took them away.
Posted on 12-Nov-07 at 11:18 am | Permalink