Women Writers Vector Toolkit

Welcome to the Women Writers Vector Toolkit (WWVT) discovery interface! This interface will allow you to query terms in word2vec models that were trained on texts from the Women Writers Online, Victorian Women Writers Project, and Early English Books Online–Text Creation Partnership collections.

To get started, type a word in the “Query term” box. The results that appear are the words that are closest to the term that you queried in vector space—that is, words that appear in similar contexts in the corpus used to train your model.
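As a rough illustration of what "closest in vector space" can mean, the sketch below ranks terms by cosine similarity, a measure commonly used with word2vec models. It is an assumption that this interface uses cosine similarity. The three-dimensional vectors and the names `TOY_VECTORS`, `cosine_similarity`, and `nearest_terms` are invented for this example and do not come from the toolkit.

```python
import math

# Toy vectors invented for illustration only; real word2vec models
# learn vectors with many more dimensions from a training corpus.
TOY_VECTORS = {
    "queen": [0.9, 0.8, 0.1],
    "king": [0.9, 0.1, 0.1],
    "princess": [0.8, 0.9, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_terms(query, vectors, n=10):
    """Return the n terms most similar to `query`, best match first."""
    query_vec = vectors[query]
    scored = [
        (term, cosine_similarity(query_vec, vec))
        for term, vec in vectors.items()
        if term != query  # the query term itself is left out of the results
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


for term, score in nearest_terms("queen", TOY_VECTORS, n=3):
    print(f"{term}: {score:.3f}")
```

The `n` parameter plays the role of the results-set size that the sidebar lets you increase.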

On the left-hand sidebar, you can select different models to query; you can also increase the number of words in your results set. More ways to explore these models can be accessed under the “Compare,” “Clusters,” “Operations,” and “Visualization” tabs above.

If you click on any individual term, a new page will open in the Women Writers Online interface and search for your term in the WWO collection (subscription required; see this page for information on subscribing and setting up a free trial).

Clusters are generated based on neighboring words in vector space—words that are used in similar contexts will be clustered together. Each column represents a different cluster, randomly selected from 150 total clusters; the words in the list are those closest to the center of the cluster.

Use the dropdown on the left to select which model you want to view. Click the “Download” button to download the set of clusters you are viewing. You can also hit the “reset clusters” button to see a new set of clusters and use the slider to see more terms from each cluster. (Note that adjusting the number of terms per cluster will also reset the clusters.)
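The cluster display described above can be approximated with the sketch below. It assumes the clusters have already been computed by some method (the page does not say which one), picks a random subset of them, and lists the terms nearest each cluster's center by Euclidean distance. The toy vectors, the cluster assignments, and the function names are all hypothetical.

```python
import math
import random

# Toy 2-dimensional vectors and cluster assignments, invented for illustration.
VECTORS = {
    "red": [1.0, 0.0],
    "orange": [0.9, 0.1],
    "crimson": [0.6, 0.4],
    "river": [0.0, 1.0],
    "stream": [0.1, 0.9],
}
CLUSTERS = {
    0: ["red", "orange", "crimson"],
    1: ["river", "stream"],
}


def centroid(points):
    """Return the coordinate-wise mean of a list of vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def terms_nearest_center(cluster_terms, vectors, n=10):
    """Return up to n terms of a cluster, closest to its center first."""
    center = centroid([vectors[t] for t in cluster_terms])
    ranked = sorted(cluster_terms, key=lambda t: math.dist(vectors[t], center))
    return ranked[:n]


def sample_clusters(clusters, how_many, seed=None):
    """Pick `how_many` cluster ids at random, as a reset would."""
    rng = random.Random(seed)
    return rng.sample(sorted(clusters), how_many)


for cluster_id in sample_clusters(CLUSTERS, how_many=2):
    print(cluster_id, terms_nearest_center(CLUSTERS[cluster_id], VECTORS, n=2))
```

Here `n` corresponds to the terms-per-cluster slider, and calling `sample_clusters` again corresponds to resetting the clusters.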



Using the sidebar on the left, you can select from several different operations and choose which model you would like to query.





Addition allows you to add the contexts associated with two terms to each other, while subtraction allows you to subtract the contexts associated with one word from another. To see how these work, try “orange” + “red” and “orange” - “red” and compare the results.

The analogies operation allows you to subtract the contexts associated with one term from another, and then add the contexts associated with a third term. For example, you might subtract “man” from “woman” to get a vector associated with the contexts of “woman” as distinct from “man.” Adding the vector for “king” then brings in its contexts, giving you words associated both with the distinction between woman and man and with royalty; in many models, the closest such word is “queen.” Or, put more simply: woman - man + king = queen; woman is to man as queen is to king.
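The addition, subtraction, and analogy operations can be sketched as element-wise arithmetic on word vectors followed by a nearest-term lookup. The example below is a minimal illustration under stated assumptions: the vectors are hand-made so that the analogy works out exactly, cosine similarity is assumed as the closeness measure, and leaving the input terms out of the results is a choice made for this sketch rather than documented behavior of the interface.

```python
import math

# Hand-made toy vectors; the second dimension loosely stands for the
# woman/man distinction and the third for royalty.
TOY_VECTORS = {
    "man": [1.0, 0.0, 0.0],
    "woman": [1.0, 1.0, 0.0],
    "king": [1.0, 0.0, 1.0],
    "queen": [1.0, 1.0, 1.0],
    "bread": [0.2, 0.1, 0.0],
}


def add(a, b):
    """Element-wise sum of two vectors."""
    return [x + y for x, y in zip(a, b)]


def subtract(a, b):
    """Element-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def closest_terms(target, vectors, exclude=(), n=5):
    """Rank terms by similarity to `target`, skipping any in `exclude`."""
    scored = [
        (term, cosine_similarity(target, vec))
        for term, vec in vectors.items()
        if term not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


def analogy(first, second, third, vectors, n=5):
    """Compute first - second + third and return the closest other terms."""
    target = add(subtract(vectors[first], vectors[second]), vectors[third])
    return closest_terms(target, vectors, exclude={first, second, third}, n=n)


# woman - man + king
print(analogy("woman", "man", "king", TOY_VECTORS, n=1))
```

Plain addition or subtraction works the same way: build the target with `add` or `subtract`, then pass it to `closest_terms`.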

The advanced option allows you to create a query of your own using multiple operations.


The Visualization tab allows you to create a word cloud for a query term. The word cloud is a collage of the words most similar to your query term in the WWO general corpus model. The top slider on the left of this page (the similarity bar) controls which words appear, based on their percentage of similarity to the query term. The similarity percentage is also represented in the visualization by the color of each word; see below for the color key. The second slider adjusts the number of words in your word cloud, and the bottom slider controls the size of the plot image.

Similarity Color Key
Similarity % -- Color
91 – 100 -- gray
81 – 90 -- brown
71 – 80 -- orange
51 – 70 -- green
00 – 50 -- pink
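The color key above can be read as a simple lookup from similarity percentage to color. The sketch below encodes the bands exactly as listed; the function name `color_for_similarity` is hypothetical, and the handling of fractional values (for example, 90.5 falls into the 81 to 90 band) is an assumption, since the key only lists whole-number ranges.

```python
# Lower bound of each band and its color, copied from the key above,
# ordered from the highest band to the lowest.
COLOR_BANDS = [
    (91, "gray"),
    (81, "brown"),
    (71, "orange"),
    (51, "green"),
    (0, "pink"),
]


def color_for_similarity(percent):
    """Return the key color for a similarity percentage from 0 to 100."""
    if not 0 <= percent <= 100:
        raise ValueError("similarity percentage must be between 0 and 100")
    for lower_bound, color in COLOR_BANDS:
        if percent >= lower_bound:
            return color


for value in (95, 85, 75, 60, 30):
    print(value, color_for_similarity(value))
```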