Welcome to the Women Writers Vector Toolkit (WWVT) discovery interface! This interface will allow you to query terms in word2vec models that were trained on texts from the Women Writers Online, Victorian Women Writers Project, and Early English Books Online–Text Creation Partnership collections.
To get started, type a word in the “Query term” box. The results that appear are the words that are closest to the term that you queried in vector space—that is, words that appear in similar contexts in the corpus used to train your model.
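As a rough illustration of what "closest in vector space" means, the sketch below ranks words by cosine similarity to a query term. The three-dimensional vectors and the function names are invented for this example; they are not taken from the WWVT models or the interface's own code.

```python
import math

# Toy vectors, invented for illustration only. Real word2vec models have
# many more dimensions, and their values are learned from the corpus.
TOY_VECTORS = {
    "queen": [0.9, 0.8, 0.1],
    "king": [0.9, 0.1, 0.1],
    "princess": [0.8, 0.9, 0.2],
    "apple": [0.1, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_words(term, vectors, n=10):
    """Return up to n (word, similarity) pairs, most similar first."""
    if term not in vectors:
        # Raising an error is a choice made for this sketch, not a
        # description of how the interface handles unknown terms.
        raise KeyError(f"{term!r} is not in the model vocabulary")
    query = vectors[term]
    scored = [
        (word, cosine_similarity(query, vec))
        for word, vec in vectors.items()
        if word != term
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


for word, score in closest_words("queen", TOY_VECTORS, n=2):
    print(word, round(score, 2))
```

With these toy values, "princess" ranks above "king" for the query "queen" because its vector points in a more similar direction.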
On the left-hand sidebar, you can select different models to query; you can also increase the number of words in your results set. More ways to explore these models can be accessed under the “Compare,” “Clusters,” “Operations,” and “Visualization” tabs above.
If you click on any individual term, a new page will take you to the Women Writers Online interface (subscription required; see our licensing page for information on subscribing and setting up a free trial) to search for your term in the WWO collection.
If the query term that you enter is not in the vocabulary of your chosen model, the results will be a list of stop and filler words rather than words related to your term.
To get started, type a word in the “Query term” box. The results that appear are the words that are closest to the term that you queried in vector space across two models. Use the sidebar on the left to select which models to query and to increase the number of words in the results set for both.
If you click on any individual term, a new page will take you to the Women Writers Online interface (subscription required; see this page for information on subscribing and setting up a free trial) to search for your term in the WWO collection.
Clusters are generated based on neighboring words in vector space—words that are used in similar contexts will be clustered together. Each column represents a different cluster, randomly selected from 150 total clusters; the words in the list are those closest to the center of the cluster.
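The idea of listing the words closest to each cluster center can be sketched as follows. This is only an assumption-laden illustration: the two-dimensional vectors, the two cluster centers, and the function names are invented here, and the sketch takes the centers as given rather than showing how the 150 clusters were actually derived.

```python
import math

# Toy word vectors and cluster centers, invented for illustration only.
TOY_VECTORS = {
    "red": [1.0, 0.1],
    "orange": [0.9, 0.2],
    "scarlet": [0.7, 0.4],
    "river": [0.1, 1.0],
    "stream": [0.2, 0.9],
    "brook": [0.4, 0.7],
}
TOY_CENTERS = [[0.9, 0.2], [0.2, 0.9]]


def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def words_nearest_centers(vectors, centers, n_words):
    """Assign each word to its nearest center, then list each cluster's
    n_words members in order of increasing distance from the center."""
    clusters = [[] for _ in centers]
    for word, vec in vectors.items():
        dists = [distance(vec, center) for center in centers]
        nearest = dists.index(min(dists))
        clusters[nearest].append((dists[nearest], word))
    return [
        [word for _, word in sorted(members)[:n_words]]
        for members in clusters
    ]


for column in words_nearest_centers(TOY_VECTORS, TOY_CENTERS, n_words=2):
    print(column)
```

Raising `n_words` here plays the same role as the slider that shows more terms from each cluster.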
Use the dropdown on the left to select which model you want to view. You can also hit the “reset clusters” button to see a new set of clusters and use the slider to see more terms from each cluster. Click the “Download” button to download the full set of 150 clusters for the current model.
If you click on any individual term, a new page will take you to the Women Writers Online interface (subscription required; see this page for information on subscribing and setting up a free trial) to search for your term in the WWO collection.
For more information on how the clusters were derived and what they show, see our guide to the Shiny app.
Using the sidebar on the left, you can select from several different operations and choose which model you would like to query.
Addition allows you to add the contexts associated with two terms to each other, while subtraction allows you to subtract the contexts associated with one word from another. To see how these work, try “orange” + “red” and “orange” - “red” and compare the results.
The analogies operation allows you to subtract the contexts associated with one term from another, and then add the contexts associated with a third term. For example, you might subtract “man” from “woman” to get a vector associated with the contexts of “woman” as distinct from “man.” Adding the vector for “king” then brings in its contexts, giving you words associated both with the distinction between woman and man and with royalty; in many models, the closest result will be “queen.” Or, put more simply: woman - man + king = queen; woman is to man as queen is to king.
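The arithmetic behind addition, subtraction, and analogies can be sketched with plain lists of numbers. Everything in this example is hypothetical: the two-dimensional vectors are hand-picked so that the analogy works, and `combine` and `closest_to_vector` are names made up for the sketch. Leaving the input terms out of the results is also an assumption of this example.

```python
import math

# Toy vectors, hand-picked for illustration: the first number loosely stands
# for "royalty" and the second for "femaleness". Real models do not have
# labelled dimensions like this.
TOY_VECTORS = {
    "man": [0.1, 0.1],
    "woman": [0.1, 0.9],
    "king": [0.9, 0.1],
    "queen": [0.9, 0.9],
    "prince": [0.8, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def combine(vectors, positive, negative=()):
    """Add the vectors for the positive terms and subtract the negative ones."""
    dims = len(next(iter(vectors.values())))
    result = [0.0] * dims
    for word in positive:
        result = [r + x for r, x in zip(result, vectors[word])]
    for word in negative:
        result = [r - x for r, x in zip(result, vectors[word])]
    return result


def closest_to_vector(vectors, target, exclude=(), n=5):
    """Rank vocabulary words by cosine similarity to an arbitrary vector."""
    scored = [
        (word, cosine_similarity(target, vec))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]


# woman - man + king
target = combine(TOY_VECTORS, positive=["woman", "king"], negative=["man"])
print(closest_to_vector(TOY_VECTORS, target, exclude={"woman", "king", "man"}, n=1))
```

Addition alone corresponds to `combine(TOY_VECTORS, positive=[a, b])`, and subtraction to `combine(TOY_VECTORS, positive=[a], negative=[b])`.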
The advanced option allows you to create a query of your own using multiple operations.
If you click on any individual term, a new page will take you to the Women Writers Online interface (subscription required; see this page for information on subscribing and setting up a free trial) to search for your term in the WWO collection.
The Visualizations tab provides some experimental visualizations for exploring query terms and their contexts. You can change models with the “Model” dropdown menu in the sidebar. The “Select visualization” dropdown menu allows you to choose different visualization options.
The Word Cloud visualization offers a spatial view of the closest words to a query term. The query term you enter appears in the center of the cloud, surrounded by terms whose proximity is based on cosine similarity to the input term. Terms with higher cosine similarities are closer to the center. The terms are also color-coded to show cosine similarity, as outlined in the key below. In the sidebar, the top slider allows you to set a threshold for the cosine similarity of displayed terms. The middle slider allows you to control the maximum number of words displayed, and the bottom slider controls the size of the plot image.
Similarity Color Key
Similarity — Color
- 0.81 – 1.0 — black
- 0.61 – 0.80 — blue
- 0.41 – 0.60 — red
- 0.21 – 0.40 — purple
- 0 – 0.20 — orange
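The key above amounts to a lookup from a similarity score to a color band, which could be sketched like this. The function name is hypothetical, and two details are assumptions of the sketch: each band's upper value is treated as an inclusive cut point, which closes the small gaps between bands such as 0.80 and 0.81, and negative scores, which the key does not cover, return `None`.

```python
def similarity_color(similarity):
    """Map a cosine similarity score to the color named in the key.

    Assumption: each band's upper value is an inclusive cut point.
    Negative scores are not covered by the key, so they return None.
    """
    if similarity < 0:
        return None
    if similarity <= 0.20:
        return "orange"
    if similarity <= 0.40:
        return "purple"
    if similarity <= 0.60:
        return "red"
    if similarity <= 0.80:
        return "blue"
    return "black"


for score in (0.85, 0.7, 0.5, 0.3, 0.1):
    print(score, similarity_color(score))
```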
This tab allows you to plot the words closest to a pair of query terms, according to cosine similarity. To see how it works, input two terms in the two boxes. You can use the slider to control the maximum number of words to plot.
For example, try plotting “happy” and “sad.” You'll see that some terms are close to “happy” but far from “sad”; some terms are close to “sad” but far from “happy”; and some terms are close to both “happy” and “sad.”
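One way to think about such a plot is that each word gets two coordinates: its cosine similarity to the first query term and its cosine similarity to the second. The sketch below computes those coordinates for a handful of made-up vectors. The vectors, the extra words, and the function names are all invented for illustration, and the sketch does not show how the interface chooses which words to plot.

```python
import math

# Toy vectors, invented for illustration only.
TOY_VECTORS = {
    "happy": [1.0, 0.1],
    "sad": [0.1, 1.0],
    "joyful": [0.9, 0.0],
    "tearful": [0.0, 0.9],
    "feeling": [0.7, 0.7],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pair_coordinates(vectors, term_a, term_b):
    """For every other word, return (similarity to term_a, similarity to term_b)."""
    vec_a, vec_b = vectors[term_a], vectors[term_b]
    return {
        word: (cosine_similarity(vec, vec_a), cosine_similarity(vec, vec_b))
        for word, vec in vectors.items()
        if word not in (term_a, term_b)
    }


for word, (x, y) in pair_coordinates(TOY_VECTORS, "happy", "sad").items():
    print(f"{word}: happy={x:.2f}, sad={y:.2f}")
```

In this toy data, "joyful" scores high on the first coordinate only, "tearful" on the second only, and "feeling" moderately high on both, mirroring the three groups described above.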