This page is the outline for the DH 2014 workshop My Very Own Voyant: From Web to Desktop Application (PDF).
Where we introduce ourselves and our experience with Voyant-Tools.
- Have you used Voyant-Tools?
- What do you hope to learn?
- Outline of what we will accomplish
1. Installing and running VoyantServer locally
Where you install VoyantServer on your laptop and run it.
- To download VoyantServer click on this link: http://dev.voyant-tools.org/downloads/current/VoyantServer.zip
- For instructions on installing and running VoyantServer see http://docs.voyant-tools.org/resources/run-your-own/voyant-server/ (this document contains more detailed information on VoyantServer)
- If you receive a USB key, you may want to first transfer the contents to your hard drive for better performance
- You can now use Voyant by running VoyantServer, which should automatically launch the URL http://127.0.0.1:8888/
Note: It’s best to click the Stop Server button in VoyantServer after you are finished or you may leave processes running.
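If the browser does not open automatically, a quick way to check whether VoyantServer is answering is a small script like the one below. It is only a sketch: the function name `is_server_up` is made up for this outline, and the default URL is the one given above.

```python
import urllib.error
import urllib.request


def is_server_up(url="http://127.0.0.1:8888/", timeout=3.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # Something answered, but with an error status (4xx/5xx).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or similar: nothing is listening.
        return False
```

Calling `is_server_up()` while VoyantServer is running should return True; after you click Stop Server it should return False.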
2. A brief tour of Voyant
Where those unfamiliar with Voyant get a brief tour.
- Try just a single tool. Go to http://127.0.0.1:8888/tool/Cirrus/ and load a text, for example http://rss.cbc.ca/lineup/topstories.xml (the CBC top news stories feed).
- Try clicking on a word.
- When using Voyant you can upload a text or enter a URL.
- You can read about the Cirrus tool here.
- Explore the default reading interface. Go to http://127.0.0.1:8888/
- Export a URL of your corpus in Voyant so you can access it again.
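The tool addresses used in this tour follow a simple pattern, which can be sketched in a few lines of Python. The `/tool/<name>/` path comes from the Cirrus example above; the `input` query parameter for passing a source URL is an assumption not stated in this outline, and `tool_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

BASE_URL = "http://127.0.0.1:8888/"


def tool_url(tool, source_url=None, base_url=BASE_URL):
    """Build the address of a single Voyant tool on the local server."""
    url = f"{base_url}tool/{tool}/"
    if source_url is not None:
        # ASSUMPTION: the `input` parameter name is not given in this outline.
        url += "?" + urlencode({"input": source_url})
    return url
```

For instance, `tool_url("Cirrus")` gives the Cirrus address used above; passing a second argument appends the URL-encoded source.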
3. VoyantServer settings
Where you learn about controlling VoyantServer.
Managing data (corpus indices)
- Where are corpus indices cached? By default data is stored in a temporary directory that is specified by your operating system. Data in that directory should persist when you start and stop VoyantServer, but it may be cleaned out by your operating system when you restart your machine.
- The easiest way to specify an alternate location, one where the data are more likely to survive a machine restart, is to create a new, empty directory called data in the same folder as where VoyantServer.jar is located.
- You can also set another location for your data by providing a path to an existing parent folder in the server-settings.txt file. Here is an example:
data_directory = /Users/grockwel/Documents/VoyantServerData
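Both options can be prepared with a short script. This is a minimal sketch, not part of VoyantServer: the helper names are invented here, and it reads "existing parent folder" as meaning that the folder containing the data directory must already exist, which is an assumption.

```python
from pathlib import Path


def ensure_default_data_dir(server_dir):
    """Create a `data` folder next to VoyantServer.jar if it is missing."""
    data_dir = Path(server_dir) / "data"
    data_dir.mkdir(exist_ok=True)
    return data_dir


def data_directory_line(path):
    """Format the data_directory line for server-settings.txt."""
    path = Path(path)
    # ASSUMPTION: only the parent folder has to exist beforehand.
    if not path.parent.is_dir():
        raise FileNotFoundError(f"parent folder does not exist: {path.parent}")
    return f"data_directory = {path}"
```

`data_directory_line("/Users/grockwel/Documents/VoyantServerData")` returns the example line shown above, provided the Documents folder exists on your machine.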
Modifying a Corpus ID/Name
Voyant Tools automatically assigns an ID/name to a corpus, a generated value like 1404362954425.942. After creating a corpus you can see the id/name by clicking the Export icon and producing a URL.
You can change the corpus id/name, but it should be considered an advanced operation. There are two steps:
- In your data directory there’s a folder named trombone3_0 which contains individual folders for each corpus. The first step is to find the folder that corresponds to your corpus and rename the folder to the new corpus id/name that you wish to use (it’s best to use a reduced character set such as alphanumeric characters, dots and hyphens)
- In the corpus folder there’s a file named corpus-metadata.xml – open it with an XML or text editor and modify the entry that is below the id entry, near the top of the file. Save the file, and the corpus should now be available with the new corpus id/name in the URL.
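The first step (renaming the folder) can be scripted; the sketch below does only that and leaves the edit to corpus-metadata.xml to be done by hand, since the file's exact layout is not described here. The function name and the safety checks are illustrative choices, not part of Voyant.

```python
import re
from pathlib import Path

# Reduced character set suggested above: alphanumerics, dots and hyphens.
SAFE_ID = re.compile(r"[A-Za-z0-9.-]+")


def rename_corpus_folder(data_dir, old_id, new_id):
    """Rename data_dir/trombone3_0/<old_id> to <new_id> (step one only)."""
    if not SAFE_ID.fullmatch(new_id):
        raise ValueError(f"unsafe corpus id: {new_id!r}")
    corpora = Path(data_dir) / "trombone3_0"
    source = corpora / old_id
    target = corpora / new_id
    if not source.is_dir():
        raise FileNotFoundError(f"no corpus folder: {source}")
    if target.exists():
        raise FileExistsError(f"corpus id already in use: {target}")
    source.rename(target)
    return target
```

Stop VoyantServer before running something like this, and keep a copy of the data directory in case the rename needs to be undone.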
Handling large corpora
- For larger corpora (>10 MB) you can increase the memory available to VoyantServer (set the value in megabytes, such as 1024, 2048 or 4096). Remember to stop and restart the server afterwards.
- Using VoyantServer on confidential information.
- How to make sure VoyantServer can’t be accessed.
5. Setting up a public server
- How to run VoyantServer for others.
- Deploying as a Tomcat application.
Deploying as a Tomcat application
VoyantServer ships with a compliant Java Servlet web application that can be deployed under different servlet containers, such as Apache Tomcat. Here are some steps:
- download Tomcat (like the core version of Tomcat 7 – the tar.gz file is recommended for Mac to preserve executable file permissions)
- uncompress the archive (usually double-clicking on the file)
- copy the _app folder from VoyantServer to the webapps folder in the Tomcat folder (see image below) – be sure to copy and not move the folder (on Mac you can hold the option key while dragging the folder)
- rename the _app folder to voyant
- it’s actually not necessary to rename the folder, but then the URL would be something like http://127.0.0.1:8080/_app
- you can also make the application run in the root of the server by deleting the existing ROOT folder and renaming _app to ROOT; the URL is then something like http://127.0.0.1:8080/
- now start Tomcat by running bin/startup.sh from the Tomcat folder (this is typically done on the command line; you can read the RUNNING.txt file in the Tomcat folder for more information)
- usually you can then visit http://127.0.0.1:8080/voyant (8080 is Tomcat’s default HTTP port; adjust it if your installation uses another)
By default the data will be stored in the temp directory in the Tomcat folder (which, despite its name, shouldn’t disappear during Tomcat or machine startup). This and many other settings can be tweaked, but it’s best to look at the Tomcat documentation.
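The copy-and-rename steps above can also be done in one go with a short script. This is a sketch under stated assumptions: `deploy_to_tomcat` is a name made up for this outline, and the two folder arguments are wherever you unpacked VoyantServer and Tomcat.

```python
import shutil
from pathlib import Path


def deploy_to_tomcat(voyant_dir, tomcat_dir, app_name="voyant"):
    """Copy VoyantServer's _app folder into Tomcat's webapps as `app_name`."""
    source = Path(voyant_dir) / "_app"
    webapps = Path(tomcat_dir) / "webapps"
    target = webapps / app_name
    if not source.is_dir():
        raise FileNotFoundError(f"_app folder not found: {source}")
    if not webapps.is_dir():
        raise FileNotFoundError(f"webapps folder not found: {webapps}")
    if target.exists():
        # Refuse to overwrite an existing web application (e.g. ROOT).
        raise FileExistsError(f"already deployed: {target}")
    # copytree copies rather than moves, so VoyantServer keeps its _app.
    shutil.copytree(source, target)
    return target
```

Passing `app_name="ROOT"` corresponds to the root deployment described above, but only after the existing ROOT folder has been removed by hand; the sketch deliberately does not delete anything.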
6. Exporting and Skinning
If we have time we will now show you how you can experiment with other Voyant Tools like ResoViz.
- After loading a text (like http://rss.cbc.ca/lineup/topstories.xml) you can choose the Export button and use “a URL for a different tool/skin and current data” – You can now experiment with different tools not available in the standard skin. Try ResoViz.
- You can also try a different skin. Experiment with the skins and try the Skin Builder.
7. For After the Workshop
Try Voyant on a text or corpus of your own after the workshop.
- Find or assemble a text of your own.
- Try studying it with Voyant.
- Experiment some more with advanced features such as exporting to a different skin. Try opening your corpus in the Skin Builder and developing your own skin.
- We are developing a version called Voyant Notebooks that has a literate programming interface where you can program with Voyant. This will allow you to keep a notebook of your analysis.
Staying in touch
If you want to be kept up to date on VoyantServer you can:
- Ask to be added to a Google group to which we will send occasional posts: firstname.lastname@example.org (Note: this is a broadcast list, not a discussion list.)
- Follow Voyant on Twitter @VoyantTools
To find and clean texts see:
- Internet Archive texts: https://archive.org/details/texts
- U of Virginia: http://etext.lib.virginia.edu/collections/subjects/
- Gutenberg: http://www.gutenberg.org/
- Arts-Humanities.net: http://arts-humanities.net/
- DRAPier: http://dho.ie/drapier/
Aggregating and Cleaning Texts:
- TAPoRware Aggregator: http://taporware.ualberta.ca/~taporware/otherTools/aggregator.shtml
- TAPoRware Cleaner: http://taporware.ualberta.ca/~taporware/betaTools/webcleaner.shtml
8. Other Tools
What other tools are there out there? See TAPoR 2.0 for a growing list of tools.
- TAPoRware Tools mentioned above: http://taporware.ualberta.ca/
- AntConc (local App) http://www.antlab.sci.waseda.ac.jp/software.html
- Berkeley Wordseer http://wordseer.berkeley.edu/
- WordHoard (Java Webstart) http://wordhoard.northwestern.edu/
- Many Eyes (Visualization) http://www-958.ibm.com/software/data/cognos/manyeyes/
- HyperCities http://hypercities.ats.ucla.edu/
- R (programming language) http://cran.r-project.org/index.html
- Mathematica http://www.wolfram.com/
- Monk Project http://monkproject.org/
- Seasr http://seasr.org/
- Alchemy API http://www.alchemyapi.com
- Illinois Named Entity Recognizer Demo http://cogcomp.cs.illinois.edu/demo/ner/?id=8