General
Who are you?
Our names are Kris and Ryan. We are software engineers from Australia.
Why did you make this site?
We have been long-time admirers of Wordle. We both also work (at the time of writing) for an Australian text analytics company called Leximancer. We have yearned for a Wordle that uses modern text analytics in its visualisation, instead of just word frequency, to help understand the text while also making it pretty. We also thought there might be other visualisations that could help achieve this goal. Towards the end of 2012 all of the relevant technologies seemed to exist, so we took the plunge and this is the result.
Will this site remain free?
In short, yes. The longer answer is that it is extremely unlikely that we will reduce the number of features that are free. In fact, we are more likely to increase them. But we also haven't ruled out providing premium features.
Are you interested in a partnership/business/licensing deal?
Possibly. We definitely aren't steadfastly against it. Try contacting us.
Technical
How are you different to Wordle?
Wordle takes your words and turns them into what most consider to be art. It creates stunning visuals that are great to look at. It does this by using word frequency, that is, by counting words. Much more advanced text analytics are now available. They vary widely in their approach, from hardcoded/fixed taxonomies to completely automatic solutions based on observed statistics.
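To make "counting words" concrete, here is a minimal Python sketch of a word frequency count. It is only an illustration of the general idea, not Wordle's actual code; the function name and the sample sentence are made up for this example.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Made-up sample text, purely for illustration.
sample = "Alice saw the rabbit. The rabbit saw Alice. The rabbit ran."
for word, count in word_frequencies(sample, top_n=3):
    print(word, count)
```

A word cloud built this way sizes each word by its count and nothing else, which is why common but uninformative words can dominate unless they are filtered out.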
We thought it would be great to use some of the recent innovations in text analytics together with advances in web technologies to put together a modern text visualisation website. A site that is based entirely on modern web technologies and text analytics (provided by Leximancer in this case).
There are a few key differences between textisbeautiful.net (TIB) and wordle.net (Wordle). Firstly, TIB deals in concepts, not words. A concept is a collection of words that is automatically discovered from the text. This is at the crux of what Leximancer does. You should take the time to read the white paper published by Leximancer for a more detailed explanation of this idea.
Additionally, TIB provides more visualisations than just the concept cloud (our concept cloud is visually similar to the Wordle output but uses more advanced text analytics underneath). Our concept cloud uses colour coding to convey relatedness between concepts (concepts that have the same colour are closely related) and also provides an exploded view (concept web) that uses positioning as well as colour. We also provide other visualisations and we hope to add more as we go. We even hope to open up the ability to add visualisations to the site so users can participate.
Err, a concept? Say what?
A lot of people struggle with the difference between a word and a concept. Leximancer provides a lot of material on their website about their software. Take the time to look through it. But in the spirit of trying to give you a quick understanding, here is a rudimentary example:
Let's suppose you are reading Alice in Wonderland. This is the Wordle for the Alice in Wonderland book:
Now compare it to the TIB concept cloud:
Firstly, you'll notice that there are far more words on the Wordle image than there are concepts on the TIB concept cloud image. That is because a concept is made up of several words - each concept has its own thesaurus that is discovered from the text. Let's take the concept "March Hare" for example. You will notice the words "March" and "Hare" on the Wordle image (they are on the right side, towards the top, not together). Leximancer has decided that "March Hare" should be its own concept, so they appear together in the same colour. The same is true of Mock Turtle and White Rabbit. The thesaurus for the concept March Hare is made up of the terms: March Hare, butter, raving, twinkled. The full list of terms is much longer, but you get the gist. Leximancer has detected through the use of statistical analysis that whenever you see these terms in the text, they are evidence of the "March Hare" concept. In reality, the presence of a single term from the thesaurus in the text generally isn't enough evidence - you need multiple terms to appear close by for the concept to be present.
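If it helps, the "multiple terms close by" idea can be sketched in a few lines of Python. This is a deliberately simplified illustration and not how Leximancer works internally: the window size, the two-term threshold and the function names are all assumptions made for this example, and the real thesaurus is longer and statistically weighted.

```python
import re

# The four thesaurus terms quoted above; the full list is much longer.
MARCH_HARE_TERMS = {"march hare", "butter", "raving", "twinkled"}

def term_positions(words, terms):
    """Return sorted (word_index, term) pairs for every thesaurus term found."""
    hits = []
    for term in terms:
        parts = term.split()
        n = len(parts)
        for i in range(len(words) - n + 1):
            if words[i:i + n] == parts:
                hits.append((i, term))
    return sorted(hits)

def concept_present(text, terms, window=20, min_hits=2):
    """True if at least `min_hits` distinct terms start within `window` words.

    `window` and `min_hits` are hypothetical tuning values for this sketch.
    """
    words = re.findall(r"[a-z']+", text.lower())
    hits = term_positions(words, terms)
    for start, (pos, _) in enumerate(hits):
        nearby = {term for p, term in hits[start:] if p - pos < window}
        if len(nearby) >= min_hits:
            return True
    return False

print(concept_present("The March Hare said it was the best butter.", MARCH_HARE_TERMS))
print(concept_present("She bought butter at the shop.", MARCH_HARE_TERMS))
```

The first sentence contains two different thesaurus terms near each other, so the sketch reports the concept as present; the second contains only "butter", which on its own is not enough evidence.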
Secondly, you'll notice that a lot of concepts in the TIB image share the same colour. This is because these concepts are highly related - they appear together often in the text. Having related concepts sharing the same colour helps you to understand the text. Make sure you look at the green concepts in the concept cloud. Do you remember at the start of the book when Alice has to eat a round cake?
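One simple way to picture "appear together often" is to count how many text segments each pair of concepts shares. The Python sketch below does just that. It is an assumption-laden illustration rather than the measure Leximancer actually uses, and the segments are invented for the example, not taken from the real analysis of the book.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(segments):
    """Count, for every pair of concepts, the segments in which both appear."""
    counts = Counter()
    for concepts in segments:
        for pair in combinations(sorted(set(concepts)), 2):
            counts[pair] += 1
    return counts

# Hypothetical concept sets, one per text segment (invented for illustration).
segments = [
    {"alice", "cake", "eat"},
    {"alice", "cake"},
    {"alice", "queen"},
]
counts = cooccurrence_counts(segments)
print(counts[("alice", "cake")])   # pairs are stored in alphabetical order
print(counts[("cake", "queen")])
```

Pairs with high counts are the ones you would expect to end up sharing a colour; pairs that never share a segment would not.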
Finally, the TIB concept cloud has a related visualisation called the concept web (see below). This layout is similar to the Leximancer concept map. Instead of trying to minimise white space, this layout encourages it when two concepts aren't related and discourages it when they are. In this layout, each concept exerts a pull on every other concept. How powerful that pull is depends on how related they are. You'll also notice the grey lines connecting concepts. Try following them to get an insight into the story (for example heard -> saying -> words).
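For the curious, the pull between concepts can be sketched as a tiny spring simulation in Python. This is a generic force-directed layout step and only an assumption about the general technique, not the code behind the concept web: the relatedness scores, the `rate` and `max_gap` parameters and the starting positions are all invented for the example.

```python
import math

def layout_step(positions, relatedness, rate=0.1, max_gap=10.0):
    """Move every concept one small step toward its preferred distance from the others.

    `relatedness` maps frozenset({a, b}) to a score from 0 (unrelated) to 1
    (strongly related); missing pairs count as unrelated.
    """
    new_positions = {}
    for a, (ax, ay) in positions.items():
        move_x = move_y = 0.0
        for b, (bx, by) in positions.items():
            if a == b:
                continue
            score = relatedness.get(frozenset((a, b)), 0.0)
            target = max_gap * (1.0 - score)      # related pairs want to sit close
            dx, dy = bx - ax, by - ay
            dist = math.hypot(dx, dy) or 1e-9     # avoid dividing by zero
            pull = rate * (dist - target) / dist  # positive pulls in, negative pushes out
            move_x += pull * dx
            move_y += pull * dy
        new_positions[a] = (ax + move_x, ay + move_y)
    return new_positions

# Hypothetical starting positions and a single made-up relatedness score.
before = {"heard": (0.0, 0.0), "saying": (4.0, 0.0), "queen": (0.0, 3.0)}
scores = {frozenset(("heard", "saying")): 0.9}
after = layout_step(before, scores)
print(after)
```

Repeating the step many times lets related concepts drift together while unrelated ones spread out, which is where the white space in this kind of layout comes from.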
Remember, these are just two of an ever-increasing number of visualisations available. Go and check out the rest for yourself; they all come with a detailed description.
Can I download the visualisation I created as an image?
Yes, you certainly can. In fact, we encourage it. Download the image, share it, use it in your work. Although attribution isn't required, it is definitely appreciated. The images generated by this site are covered by the Creative Commons Attribution 3.0 Unported License.
Can I share it using <insert social network name here>?
We have built-in support for Twitter, Google+ and Facebook. To share it anywhere else, download the image directly from here and post it wherever you like.
Can you make it available as a web service?
Possibly. It's something we will consider in the future, though it might not be a free service. If you really want to make your own visualisations like we have done here, then you should contact Leximancer and get access to their software. This website is open source, but Leximancer is a commercial product.
Do you support languages other than English?
Not yet, but we hope to very soon! Stay tuned for announcements on our blog!