Google’s Custom Search Engine with keywords

There have been some recent developments in helping people build their own corpora. One of these is the Wikipedia corpus builder from BYU-COCA developer Mark Davies. The advantage of this system is its handy keyword extraction tool (noun, verb, adjective, adverb, noun + noun, adjective + noun), alongside the usual functions from the BYU suite of corpus tools.

The disadvantage is that it is limited to texts from Wikipedia.

An alternative is to use Google’s Custom Search Engine (CSE), which lets you add the websites you are interested in. More recently (well, recent for me) it has gained a tool for creating a search engine from keywords.

I have previously used the CSE to build a Mission to Mars search engine containing URLs with information related to the NASA mission to Mars. This, however, required some effort in first finding relevant URLs.

Now, with the keyword tool, this is a much easier process.

I will describe how to do this using the example of setting up a Bridge Building corpus that I want to use with students. The task was to watch a video on a mega-bridge, take notes, then write a summary. I want to use the CSE to get students to look up “incorrect” language use in their summaries and revise it using information from the CSE.

As mentioned, I have done this previously with the Mission to Mars CSE, but since it took some fiddling to build I never really followed up on it. Now, with the relative speed of keyword search, this is something I want to try again.

Note that I will describe how to use the CSE in detail in another post; suffice it to say for now that simple Google search operators such as double quotation marks and the wildcard asterisk go a long way.

Note that you do need a Google account.
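To give a rough idea of the kind of searches I mean, here is a minimal Python sketch that builds the query strings a student might type into the CSE search box. The function names and the example phrases are my own illustrations, not part of any Google tool; the only thing taken from Google search itself is the use of double quotation marks for an exact phrase and the asterisk as a wildcard.

```python
def exact_phrase(phrase: str) -> str:
    """Wrap a phrase in double quotes so it is searched as an exact string."""
    return '"' + phrase.strip().strip('"') + '"'


def with_wildcard(before: str, after: str) -> str:
    """Build an exact-phrase query with an asterisk standing in for the unknown part."""
    return exact_phrase(f"{before.strip()} * {after.strip()}")


# Hypothetical queries a student could paste into the CSE search box.
print(exact_phrase("suspension bridge"))
print(with_wildcard("the bridge", "the river"))
```

A student unsure which verb goes between "the bridge" and "the river" could search the second query and see what the pages in the CSE actually use.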

When you create a new search engine you get this screen:

CSE-keyword-screen1

The “Use the CSE creation from keywords tool” option is circled in red.

Clicking on that takes you to this screen, which asks you to name your search engine:

CSE-keyword-screen2

Then the following screen is presented, where you fill in your relevant keywords:

CSE-keyword-screen3

Here is the screen with keywords I entered for my Bridge Building CSE:

CSE=keyword-screen3-filled

Then you press Expand keywords and you get a screen like this:

CSE-keyword-screen3-filled-expand

These are the words Google will use to retrieve relevant URLs. At this step it is advisable to check the keywords and, if any are not relevant, to use the add negative items feature (circled in red).

As you can see, my initial keyword center span brought up words related to HTML web programming and design, so I needed to eliminate such words in a cycle of adding negative terms and checking the resulting search words.
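The logic of that cycle can be sketched in a few lines of Python. This is only an illustration of the idea, not how Google's tool works internally: the function name, the candidate keywords and the negative terms below are all made up for the example.

```python
def filter_keywords(candidates, negative_terms):
    """Keep only the candidate keywords that contain none of the negative terms."""
    negatives = {term.lower() for term in negative_terms}
    kept = []
    for keyword in candidates:
        words = set(keyword.lower().split())
        # Drop the keyword if any of its words is a negative term.
        if words.isdisjoint(negatives):
            kept.append(keyword)
    return kept


# Hypothetical expanded keywords, some drifting towards web design.
expanded = ["center span", "span tag", "suspension cable", "css center"]
negative = ["tag", "css", "html"]

print(filter_keywords(expanded, negative))
```

In practice you repeat this by eye in the CSE screen: look at the expanded list, add a negative term, and check the list again until only bridge-related words remain.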

Then, once you are happy with the keywords, press the Generate URL Patterns button, shown circled in the next screenshot:

Generate_URL_patterns

As a result you get a list of URLs that you can check for relevance before adding them to your custom search engine:

Gen_URL-results
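If the list is long, one way to make the relevance check easier is to group the URL patterns by site, so that you can judge each website once. The following is a small sketch under my own assumptions: the patterns are placeholders on example.com and example.org, not the ones Google returned, and the function name is hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_domain(patterns):
    """Group URL patterns by host name so each site can be reviewed once."""
    groups = defaultdict(list)
    for pattern in patterns:
        # Patterns may lack a scheme; add one so urlparse can find the host.
        url = pattern if "://" in pattern else "http://" + pattern
        groups[urlparse(url).netloc.lower()].append(pattern)
    return dict(groups)


# Placeholder patterns, for illustration only.
patterns = [
    "www.example.com/bridges/*",
    "http://www.example.com/span",
    "example.org/*",
]

for domain, items in group_by_domain(patterns).items():
    print(domain, len(items))
```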

Each CSE you build has a public URL you can share with students; here is mine for the Bridge Building CSE.

Thanks for reading, and do feel free to fire off any questions and comments below.

NB1:

Rudy Loock (@RudyLoock) found that if you are signed into Google with non-English settings, you may need to change the language settings to English in order to see the “Use the CSE creation from keywords tool” option.

NB2:

Here is a link to an article about using Google Search and Google Scholar in the Humanising Language Teaching magazine – Google Giveth and Google Taketh, Developments in Google as a Corpus.
