General

Who are you?

Our names are Kris and Ryan. We are software engineers from Australia. After the success of this site, we joined forces with some other industry leaders to form Kapiche.

Why did you make this site?

We are long-time admirers of Wordle, and we have worked in the text analytics industry for over 6 years. We yearned for a Wordle that uses modern text analytics in its visualisation, rather than just word frequency, to help you understand the text while also making it pretty. We also thought there might be other visualisations that could help achieve this goal. Towards the end of 2012 all of the relevant technologies seemed to exist, so we took the plunge and this is the result.

Will this site remain free?

In short, yes. The longer answer is that it is extremely unlikely that we will reduce the number of free features. In fact, we are more likely to increase them. We are also busy working on a couple of paid tiers for this site with premium features such as saving projects, customising settings and uploading files.

Are you interested in a partnership/business/licensing deal?

Yes, very much so. Try contacting us.

What's next?

This site started as a hobby - something we built in our free time, put online and left running. We've been blown away by the attention it has received (see the tweet by Ashley Jubb here and the blog post from Wordle here!), particularly because we have put zero marketing effort in. Based on the success of this site and the constant stream of messages we've had from our users, we have decided to start Kapiche. As part of that endeavour, we are rebuilding this site to be much more powerful and useful. Don't worry, the free tier will remain! Stay tuned for more details.

Technical

How are you different to Wordle?

Wordle takes your words and turns them into what most consider to be art. It creates stunning visuals that are great to look at. It does this by using word frequency - counting words. Much more advanced text analytics are now available. They vary wildly in their approach, from hardcoded/fixed taxonomies to completely automatic solutions based on observed statistics.
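To make "counting words" concrete, here is a minimal sketch of a word-frequency count in Python. It is only an illustration of the general idea, not Wordle's actual code; the function name and the sample sentence are made up for this example.

```python
import re
from collections import Counter

def word_frequencies(text, top_n=5):
    """Count how often each word occurs, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(top_n)

# Sample sentence invented for illustration.
sample = "The March Hare and the Hatter were having tea. The Hare said nothing."
top_words = word_frequencies(sample, top_n=2)
print(top_words)
```

A word cloud built this way sizes each word by its count alone, which is why common words tend to dominate unless they are filtered out.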

We thought it would be great to combine some of the recent innovations in text analytics with advances in web technologies to build a modern text visualisation website.

There are a few key differences between textisbeautiful.net (TIB) and wordle.net (Wordle). Firstly, TIB deals in topics, not words. A topic is a collection of words that is automatically discovered from the text. This is at the crux of what our text analytics does.

Additionally, TIB provides more visualisations than just the topic cloud (our topic cloud is visually similar to the Wordle output but uses more advanced text analytics underneath). Our topic cloud uses colour coding to convey relatedness between topics (topics that have the same colour are closely related) and also provides an exploded view (topic web) that uses positioning as well as colour. We also provide other visualisations and we hope to add more as we go. We even hope to open up the ability to add visualisations to the site so users can participate.

Err, a topic? Say what?

A lot of people struggle with the difference between a word and a topic. In the interest of trying to give you a quick understanding, here is a rudimentary example:

Let's suppose you are reading Alice in Wonderland. This is the Wordle for the Alice in Wonderland book:

Now compare it to the TIB topic cloud:

Firstly, you'll notice that there are far more words on the Wordle image than there are topics on the TIB topic cloud image. That is because a topic is made up of several words: each topic has its own thesaurus that is discovered from the text. Let's take the topic "March Hare" as an example. You will notice the words "March" and "Hare" on the Wordle image (they are on the right side, towards the top, but not together). Our technology has decided that "March Hare" should be its own topic, so the two words appear together in the same colour. The same is true of "Mock Turtle" and "White Rabbit". The thesaurus for the topic "March Hare" includes the terms March Hare, butter, raving and twinkled. The full list of terms is much longer, but you get the gist. Our technology has detected through statistical analysis that these terms are evidence of the "March Hare" topic when they appear in the text. In practice, a single term from the thesaurus generally isn't enough evidence on its own: multiple terms need to appear close together for the topic to be considered present.
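If you think in code, the "multiple terms close by" idea might be sketched roughly like this. This is not our actual implementation: the window size, the minimum number of terms and the function names are all assumptions made for the example, and the term list is just the four terms quoted above.

```python
import re

# The four terms quoted in this FAQ; the real thesaurus is much longer.
MARCH_HARE_TERMS = ["march hare", "butter", "raving", "twinkled"]

def term_positions(tokens, terms):
    """Return sorted (token index, term) pairs for every thesaurus term found."""
    hits = []
    for term in terms:
        parts = term.split()
        for i in range(len(tokens) - len(parts) + 1):
            if tokens[i:i + len(parts)] == parts:
                hits.append((i, term))
    return sorted(hits)

def topic_present(text, terms, window=10, min_terms=2):
    """Hypothetical rule: the topic counts as present when at least
    `min_terms` distinct terms start within `window` tokens of each other."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = term_positions(tokens, terms)
    for start, _ in hits:
        nearby = {term for pos, term in hits if start <= pos < start + window}
        if len(nearby) >= min_terms:
            return True
    return False

# One term alone is not enough evidence; two terms close together are.
one_term = topic_present("She spread butter on the bread.", MARCH_HARE_TERMS)
two_terms = topic_present(
    "The March Hare dipped his watch in the best butter.", MARCH_HARE_TERMS
)
print(one_term, two_terms)
```

A fixed threshold like this is the simplest possible stand-in; a statistical approach would weigh each term by how strongly it signals the topic.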

Secondly, you'll notice that a lot of topics in the TIB image share the same colour. This is because these topics are highly related - they appear together often in the text. Having related topics sharing the same colour helps you to understand the text. Make sure you look at the green topics in the topic cloud. Do you remember at the start of the book when Alice has to eat a round cake?
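"Appearing together often" can be pictured as a simple co-occurrence count, as in the sketch below. It assumes the text has already been split into segments and that each segment lists the topics found in it. The segments and topic names here are invented for illustration and are not real TIB output.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(segments):
    """Count how many segments each pair of topics shares."""
    counts = Counter()
    for topics in segments:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(set(topics)), 2):
            counts[pair] += 1
    return counts

# Invented example: the topics detected in three segments of a text.
segments = [
    {"cake", "eat", "round"},
    {"cake", "eat"},
    {"queen", "king"},
]
counts = cooccurrence_counts(segments)
print(counts.most_common(1))
```

Pairs with high counts are candidates for sharing a colour; how the counts are turned into colour groups is a separate step that this sketch leaves out.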

Finally, the TIB topic cloud has a related visualisation called the topic web (see below). Instead of trying to minimise white space, this layout encourages it when two topics aren't related and discourages it when they are. In this layout, each topic exerts a pull on every other topic. How powerful that pull is depends on how related they are. You'll also notice the grey lines connecting topics. Try following them to get an insight into the story (for example heard -> saying -> words).
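For the curious, a layout where related topics pull each other closer can be sketched as a small simulation. The version below is an assumption-laden illustration, not the layout code used on this site: it treats every pair of topics as a spring whose preferred length shrinks as relatedness grows, and the relatedness score, the "croquet" topic and all parameter values are made up.

```python
import math

def layout_step(positions, relatedness, max_dist=10.0, rate=0.1):
    """Move every topic once; closely related pairs settle nearer each other."""
    new_positions = {}
    for a, (ax, ay) in positions.items():
        dx_total = dy_total = 0.0
        for b, (bx, by) in positions.items():
            if a == b:
                continue
            r = relatedness.get(frozenset((a, b)), 0.0)
            # Assumed rule: related topics prefer a short gap, unrelated a long one.
            target = max_dist * (1.0 - r)
            dx, dy = bx - ax, by - ay
            dist = math.hypot(dx, dy) or 1e-9
            # Positive when too far apart (pull), negative when too close (push).
            strength = rate * (dist - target) / dist
            dx_total += strength * dx
            dy_total += strength * dy
        new_positions[a] = (ax + dx_total, ay + dy_total)
    return new_positions

def distance(positions, a, b):
    """Straight-line distance between two topics."""
    (ax, ay), (bx, by) = positions[a], positions[b]
    return math.hypot(bx - ax, by - ay)

# Invented starting positions and a made-up relatedness score.
positions = {"heard": (0.0, 0.0), "saying": (5.0, 0.0), "croquet": (0.0, 5.0)}
relatedness = {frozenset(("heard", "saying")): 0.9}
for _ in range(200):
    positions = layout_step(positions, relatedness)

print(distance(positions, "heard", "saying") < distance(positions, "heard", "croquet"))
```

After enough steps the related pair ends up much closer together than the unrelated ones, which is the white space effect described above.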

Remember, these are just two of an ever-increasing number of visualisations available. Go and check out the rest for yourself. They all come with a detailed description.

Can I download the visualisation I created as an image?

Yes, you certainly can. In fact, we encourage it. Download the image, share it, use it in your work. Although attribution isn't required, it is definitely appreciated. The images generated by this site are covered by the Creative Commons Attribution 3.0 Unported License.

Can I share it using <insert social network name here>?

We have built-in support for Twitter, Google+ and Facebook. To share it anywhere else, download the image directly from here and post it wherever you like.

Can you make it available as a web service?

Yes! We are working feverishly on this. Head on over to the Kapiche website and get in contact with us for more information!

Do you support languages other than English?

Not yet, but it's coming! Stay tuned for announcements on our blog!

© 2012-2014 Kapiche Limited.