General

Who are you?

Our names are Kris and Ryan. We are software engineers from Australia. After the success of this site we have joined forces with some other industry leaders to form Kapiche.

Why did you make this site?

We have been long-time admirers of Wordle. We have also been in the text analytics industry for over 6 years. We have yearned for a Wordle that uses modern text analytics, rather than just word frequency, to help you understand the text while still looking pretty. We also thought there might be other visualisations that could help achieve this goal. Towards the end of 2012 all of the relevant technologies seemed to exist, so we took the plunge and this is the result.

Will this site remain free?

In short, yes. The longer answer is that it is extremely unlikely that we will reduce the number of features that are free. In fact, we are more likely to increase them. But we are also busy working on a couple of paid tiers for this site with premium features such as saving projects, customising settings, uploading files etc.

Are you interested in a partnership/business/licensing deal?

Yes, very much so. Try contacting us.

What's next?

This site started as a hobby - something we did in our free time, put online and let run. We've been blown away by the attention it has received (see the tweet by Ashley Jubb here and the blog post from Wordle here!), particularly because we have put zero marketing effort in. Based on the success of this site and the constant stream of messages we've had from our users, we have decided to start Kapiche. As part of that endeavour, we are rebuilding this site to be much more powerful and useful. Don't worry, the free tier will remain! Stay tuned for more details.

Technical

How are you different to Wordle?

Wordle takes your words and turns them into what most consider to be art. It creates stunning visuals that are great to look at. It does this by using word frequency - counting words. Much more advanced text analytics are now available. They vary widely in their approach, from hardcoded/fixed taxonomies to completely automatic solutions based on observed statistics.
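
To make "counting words" concrete, here is a minimal sketch of word-frequency counting in general. It is not Wordle's actual code; the function name and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Count how often each word occurs and return the most common ones."""
    # Lower-case the text and keep runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Made-up sample sentence, purely for illustration.
sample = "the Hare and the Hatter and the Dormouse"
print(word_frequencies(sample, top_n=2))
```

A frequency-based cloud would then draw each word at a size that grows with its count.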

We thought it would be great to combine some of the recent innovations in text analytics with advances in web technologies to build a text visualisation website based entirely on modern web technologies and text analytics.

There are a few key differences between textisbeautiful.net (TIB) and wordle.net (Wordle). Firstly, TIB deals in concepts, not words. A concept is a collection of words that is automatically discovered from the text. This is at the crux of what our text analytics does.

Additionally, TIB provides more visualisations than just the concept cloud (our concept cloud is visually similar to the Wordle output but uses more advanced text analytics underneath). Our concept cloud uses colour coding to convey relatedness between concepts (concepts that have the same colour are closely related) and also provides an exploded view (concept web) that uses positioning as well as colour. We also provide other visualisations and we hope to add more as we go. We even hope to open up the ability to add visualisations to the site so users can participate.

Err, a concept? Say what?

A lot of people struggle with the difference between a word and a concept. In the interest of trying to give you a quick understanding, here is a rudimentary example:

Let's suppose you are reading Alice in Wonderland. This is the Wordle for the Alice in Wonderland book:

Now compare it to the TIB concept cloud:

Firstly, you'll notice that there are far more words on the Wordle image than there are concepts on the TIB concept cloud image. That is because a concept is made up of several words - each concept has its own thesaurus that is discovered from the text. Let's take the concept "March Hare" for example. You will notice the words "March" and "Hare" on the Wordle image (they are on the right side, towards the top, not together). Our technology has decided that "March Hare" should be its own concept, so they appear together in the same colour. The same is true of Mock Turtle and White Rabbit. The thesaurus for the concept March Hare includes the terms: March Hare, butter, raving, twinkled. The full list of terms is much longer, but you get the gist. Our technology has detected through the use of statistical analysis that whenever you see these terms in the text, they are evidence of the "March Hare" concept. In reality, the presence of a single term from the thesaurus in the text generally isn't enough evidence - you need multiple terms to appear close by for the concept to be present.
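
If you prefer code, the "multiple terms close by" idea can be sketched roughly like this. This is only an illustration, not our actual algorithm: the function name, the threshold of two terms, and the idea of checking one short block of text at a time are all assumptions, and the thesaurus is just the four terms from the example above.

```python
import re

# The four terms quoted above; the real thesaurus is much longer.
MARCH_HARE_THESAURUS = {"march hare", "butter", "raving", "twinkled"}

def concept_present(block, thesaurus, min_terms=2):
    """Return True if at least `min_terms` distinct thesaurus terms occur in `block`.

    `block` stands for one short stretch of text (say, a few sentences), so any
    terms found in it count as "close by". The threshold is an assumption.
    """
    # Normalise to lower-case words separated by single spaces.
    normalised = " ".join(re.findall(r"[a-z']+", block.lower()))
    padded = f" {normalised} "
    # Pad with spaces so that only whole words (or whole phrases) match.
    found = {term for term in thesaurus if f" {term} " in padded}
    return len(found) >= min_terms

print(concept_present("The March Hare said it was the best butter.", MARCH_HARE_THESAURUS))
print(concept_present("Pass the butter, please.", MARCH_HARE_THESAURUS))
```

The first block contains two terms from the thesaurus, so it counts as evidence of the concept; the second contains only one, so it does not.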

Secondly, you'll notice that a lot of concepts in the TIB image share the same colour. This is because these concepts are highly related - they appear together often in the text. Having related concepts sharing the same colour helps you to understand the text. Make sure you look at the green concepts in the concept cloud. Do you remember at the start of the book when Alice has to eat a round cake?
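
One simple way to measure how often concepts "appear together" is to count the blocks of text that two concepts share. The sketch below shows that idea only; it is not a description of how TIB scores relatedness or assigns colours, and the function name and sample data are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(blocks):
    """Count how many blocks each pair of concepts shares.

    blocks: a list of sets, each holding the concepts found in one block of text.
    Pairs are stored in alphabetical order so (a, b) and (b, a) are the same key.
    """
    counts = Counter()
    for concepts in blocks:
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return counts

# Invented data: the concepts found in three blocks of text.
blocks = [{"alice", "cake", "eat"}, {"cake", "eat"}, {"alice", "queen"}]
counts = cooccurrence_counts(blocks)
print(counts[("cake", "eat")])
```

Pairs with high counts would be candidates for sharing a colour.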

Finally, the TIB concept cloud has a related visualisation called the concept web (see below). Instead of trying to minimise white space, this layout encourages it when two concepts aren't related and discourages it when they are. In this layout, each concept exerts a pull on every other concept. How powerful that pull is depends on how related they are. You'll also notice the grey lines connecting concepts. Try following them to get an insight into the story (for example heard -> saying -> words).
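
For the curious, a layout where "each concept exerts a pull on every other concept" can be sketched as a single step of a simple force-directed layout. This is a generic illustration under our own assumptions (the function name, the force formula and the constants are hypothetical), not the code that runs on this site.

```python
import math

def layout_step(positions, relatedness, attraction=0.1, repulsion=0.05):
    """Move every concept one small step and return the new positions.

    positions:   {concept: (x, y)}
    relatedness: {(a, b): score between 0 and 1}, looked up in either order
    """
    new_positions = {}
    for a, (ax, ay) in positions.items():
        move_x = move_y = 0.0
        for b, (bx, by) in positions.items():
            if a == b:
                continue
            dx, dy = bx - ax, by - ay
            dist = math.hypot(dx, dy) or 1e-9  # avoid dividing by zero
            score = relatedness.get((a, b), relatedness.get((b, a), 0.0))
            # The pull towards b grows with relatedness; a small constant push
            # keeps unrelated concepts apart (this is what creates white space).
            force = attraction * score * dist - repulsion / dist
            move_x += force * dx / dist
            move_y += force * dy / dist
        new_positions[a] = (ax + move_x, ay + move_y)
    return new_positions

# Invented example: "heard" and "saying" are related, "cake" is related to neither.
start = {"heard": (0.0, 0.0), "saying": (1.0, 0.0), "cake": (0.0, 1.0)}
after = layout_step(start, {("heard", "saying"): 1.0})
print(after)
```

Repeating the step many times lets related concepts drift together while unrelated ones spread out.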

Remember, these are just two of an ever-increasing number of visualisations available. Go and check out the rest for yourself; they all come with a detailed description.

Can I download the visualisation I created as an image?

Yes, you certainly can. In fact, we encourage it. Download the image, share it, use it in your work. Although attribution isn't required, it is definitely appreciated. The images generated by this site are covered by the Creative Commons Attribution 3.0 Unported License.

Can I share it using <insert social network name here>?

We have built-in support for Twitter, Google+ and Facebook. To share it anywhere else, download the image and post it wherever you like.

Can you make it available as a web service?

Yes! We are working feverishly on this. Head on over to the Kapiche website and get in contact with us for more information!

Do you support languages other than English?

Not yet, but it's coming! Stay tuned for announcements on our blog!

© 2012-2014 Kapiche Limited.