Male (88%), writing like Oscar Wilde (35%)

Looking at Paul Rayson’s blog, I discovered an interesting link: http://www.genderanalyzer.com. It is a web form where you enter a URL and get an estimate of whether the author of the text is male or female. For me it worked great 😉 It says that the text in my blog was written by a male with 88% probability. I tried it with a few more of my pages and it worked. Then I looked at some pages of my female colleagues, and to my surprise it seems they do not write their web pages themselves (the program indicated a male writer with 95%) – they probably all have a hidden male assistant 😉

While I was in Lancaster I shared an office with Paul for most of the time. During this time I learned a lot of interesting things about corpus linguistics and phenomena in language in general – just by sharing the office. One fact that was surprising to me at the time: if you take 6 words from an arbitrary text, in the exact order in which they appear, and search the web for that exact phrase, it is likely that you will only find this one text. How many hits do you get for the phrase “I was at Trinity College reading” in Google? Try it out 😉 [to students: that is why not getting caught when you plagiarize is really hard]
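To make the six-word idea concrete, here is a minimal sketch in Python. It only cuts a text into runs of six consecutive words and wraps one of them in quotes for an exact-phrase search; it does not contact any search engine. The function names (word_ngrams, exact_phrase_query, shared_phrases) are made up for this illustration, and the simple word-splitting rule is an assumption, not how any particular plagiarism checker works.

```python
import re


def word_ngrams(text, n=6):
    """Return every run of n consecutive words, in the order they appear."""
    # Simplifying assumption: a "word" is a run of letters, digits or apostrophes.
    words = re.findall(r"[\w'’]+", text)
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def exact_phrase_query(phrase):
    """Wrap a phrase in double quotes, the usual exact-phrase search syntax."""
    return '"' + phrase + '"'


def shared_phrases(text_a, text_b, n=6):
    """Return the n-word phrases (lowercased) that occur in both texts."""
    phrases_a = {p.lower() for p in word_ngrams(text_a, n)}
    phrases_b = {p.lower() for p in word_ngrams(text_b, n)}
    return phrases_a & phrases_b


if __name__ == "__main__":
    original = "I was at Trinity College reading history"
    suspect = "He said I was at Trinity College reading books"
    print(exact_phrase_query(word_ngrams(original)[0]))
    print(shared_phrases(original, suspect))
```

Even a single shared six-word phrase between two supposedly independent texts is a hint worth a closer look, which is the point of the note to students above.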

From http://www.genderanalyzer.com I came to http://www.ofaust.com, and to my great surprise I write like Oscar Wilde (35%) and Friedrich Nietzsche (30%). Thinking of social networks (and in particular the use of language within closed groups), such text analysis tools could become an interesting enabling technology for novel applications. Perhaps I should visit Paul again in Lancaster…
PS: and I nearly forgot I am a thinker / INTJ – The Scientists (according to http://www.typealyzer.com/)
PPS (2008-11-17): a further URL on the gender topic, contributed by my colleagues: http://www.mikeonads.com/2008/07/13/using-your-browser-url-history-estimate-gender/

Google Chrome, secrets, the power of search engines

Lots of people downloaded Google Chrome during the conference. It seems that Google managed to keep it secret until the day it was launched – they have managed that before with other releases… Given the quality of the software, that seems really hard – or not?

How does Google manage to keep its developments secret? One random thought: keeping a secret is much easier if you have control over everyone’s search engine and can decide what shows up and what does not…

Just thinking about this shows again the power the search engine company has over the user… Perhaps I should get used again to searching regularly with different search engines (e.g. http://www.cuil.com/, which crawls the web and has its own index). Perhaps there could be a small project to create a search site that combines results from different sources (… history repeats… metasearch engines were popular in the 90s before AltaVista came along).
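As a rough idea of what “combining results from different sources” could look like, here is a small sketch using reciprocal rank fusion, a standard way to merge ranked lists. This is my choice for the illustration, not something the project above has decided on. The function name merge_rankings, the example URLs and the constant k=60 (a commonly used default) are all assumptions, and the rankings are hard-coded instead of fetched from real engines.

```python
from collections import defaultdict


def merge_rankings(rankings, k=60):
    """Merge several ranked URL lists using reciprocal rank fusion.

    Each URL collects 1 / (k + position) from every list it appears in,
    so URLs ranked highly by several engines end up on top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, url in enumerate(ranking, start=1):
            scores[url] += 1.0 / (k + position)
    # Highest score first; ties are broken alphabetically to keep the order stable.
    return sorted(scores, key=lambda url: (-scores[url], url))


if __name__ == "__main__":
    # Hypothetical result lists from two engines.
    engine_a = ["example.org/a", "example.org/b", "example.org/c"]
    engine_b = ["example.org/b", "example.org/c", "example.org/d"]
    for url in merge_rankings([engine_a, engine_b]):
        print(url)
```

The nice property is that only rank positions are needed, so engines with incomparable relevance scores can still be combined.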

PS: seems that the new browser works reasonably fast and rendering is OK.

The countdown has started – about 5 weeks to the prototype

Yesterday our summer project started at IAIS. The students are highly motivated and the combined skill set of the participants is impressive. We discussed at length what we want to achieve over the next weeks.

Creating a new special purpose search service – basically from the rough idea to a working prototype – in 5 weeks seems a bit crazy, but I am confident that we will get there 😉 In certain areas we already have an idea of how many pages we have to crawl and how much content we have to analyze.

It is interesting that it is already becoming apparent that user interface issues and system architecture decisions are closely linked. E.g. doing a meta search while the user is waiting requires some other content that we can present while the user is waiting for the results.
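One way to picture this link between interface and architecture is to query all sources concurrently and hand each partial result to the user interface as soon as it arrives, so there is something to show while the slower sources are still working. The sketch below is only an illustration under assumptions: query_source simulates latency with a sleep instead of a real network call, and the names metasearch, on_partial and the source names are invented here, not taken from the project.

```python
import asyncio


async def query_source(name, delay):
    """Stand-in for a real search request; the sleep simulates network latency."""
    await asyncio.sleep(delay)
    return name, [f"{name}-result-1", f"{name}-result-2"]


async def metasearch(sources, on_partial):
    """Query all sources concurrently and report each one as soon as it finishes."""
    tasks = [asyncio.create_task(query_source(name, delay))
             for name, delay in sources.items()]
    for finished in asyncio.as_completed(tasks):
        name, results = await finished
        on_partial(name, results)  # e.g. update the page with partial results


def run_demo():
    """Return the order in which the simulated sources delivered results."""
    arrived = []
    sources = {"slow": 0.2, "fast": 0.01, "medium": 0.1}  # hypothetical latencies in seconds
    asyncio.run(metasearch(sources, lambda name, results: arrived.append(name)))
    return arrived


if __name__ == "__main__":
    print(run_demo())
```

With a design like this the fastest source determines how soon the user sees something, while the slowest one only determines when the list is complete.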

Deadline for Summer@IAIS soon

Not much time left to apply for the student research project. From 20.8. to 30.9.2007 we plan to design and implement a new specific search engine. The program is open to all computer science and media informatics students, primarily in Germany. We assume it will be very competitive. For accepted students we provide a HIWI-job at Fraunhofer IAIS for the 5 weeks. It will be possible to get credits for the course (IPEC lab course at the University of Bonn).

For more information please see: www.iais.fraunhofer.de/summer2007.html