Immediate Data: The World Wide Web as a Resource for Teaching Research Methods

Lynne M. Pachnowski, Ph.D.
University of Akron, Akron, OH

Isadore Newman, Ph.D.
University of Akron, Akron, OH

Joseph P. Jurczyk, M.B.A.



Abstract

The World Wide Web provides a convenient source of databases and examples of survey resources for those who teach research methods. For example, lottery numbers, traffic patterns, and surveys and their results can be found on the World Wide Web. These can stimulate discussion and lead to student analyses completed with immediate data. This paper presents some examples of these sources and offers suggestions regarding their applications in the classroom.

The myriad databases and data-like information on the World Wide Web are used every day by people in business and industry who need to tailor the information to answer their specific questions. It would appear logical that students learning statistics would benefit from the same medium in learning their subject. The Internet not only provides a huge source of data -- raw and otherwise formatted -- but also provides an authentic learning experience that remains applicable after the course is over. Furthermore, typically neither student nor instructor need go far from the classroom to experience this authentic application.

This paper will present some of these Internet sites that may be used in a research methods course and suggest ways in which they may be integrated into class discussion and activities. However, the examples and suggestions given reflect only the "tip of the iceberg". These examples are meant to be used within a class, to stimulate the research instructor to modify the information to fit his/her course, or to encourage instructors to locate sites that might better fit the content and methodology of a course. An instructor can use the data at these sites to demonstrate to the students many of the concepts found in the course text. Using the Web locations, students and instructors together can hypothesize and then investigate their conjectures within the environment of the classroom and/or the computer lab, yet both the data and the results are real. Without collecting any new data, but simply using what is available to anyone with a computer and a modem, students can practice the questioning and investigative skills of a researcher under the guidance of their instructor.

The body of this paper contains a discussion of actual Internet Web sites, their addresses, a short description of the site, and finally a discussion of its potential research methods course application. The paper concludes with a discussion of the limitations of such a process and a bibliography of cyber and non-cyber resources that readers may investigate on their own.

I. Examples of Sites and Their Instructional Uses

Random Data:

  • The Texas Lotto - http://crashdummy.iglobal.net/lotto

    The Web page for this site contains the "Pick 6" winning numbers of the last Texas drawing.

    Choosing "Frequency of Numbers Drawn" yields a page which states that the webmaster chose to drop the frequency tables so that updates could be done more quickly and since some objected to them "because they imply that, contrary to the science of probability, future drawings could be predicted through their use." In class, this page could generate a good deal of discussion regarding the use of frequency tables in lottery situations and whether they could actually be used to predict the outcome of a lottery. Why would some numbers occur more frequently than others? The page goes on to state that the frequencies can be derived from the history tables. How can this be done?
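    One way to explore why some numbers come up more often than others is to simulate a fair lottery in class. The sketch below is illustrative only: it assumes a game that draws 6 distinct numbers from 1 to 50 (an assumption about the game format, not something stated on the site), and the function name and seed are hypothetical. Every number is equally likely in the simulation, yet the resulting frequency table is still uneven.

```python
import random
from collections import Counter

POOL_SIZE = 50  # assumed size of the number pool (not taken from the site)
PICKS = 6       # assumed count of numbers drawn per drawing


def simulate_frequencies(num_drawings, seed=1997):
    """Count how often each number appears in a simulated fair lottery."""
    rng = random.Random(seed)  # fixed seed so the whole class sees the same run
    counts = Counter()
    for _ in range(num_drawings):
        # sample() draws without replacement, like balls from a machine
        counts.update(rng.sample(range(1, POOL_SIZE + 1), PICKS))
    return counts


num_drawings = 200
counts = simulate_frequencies(num_drawings)
expected = num_drawings * PICKS / POOL_SIZE  # average appearances per number
most = counts.most_common(1)[0]              # (number, count) pair
least = min(range(1, POOL_SIZE + 1), key=lambda n: counts[n])

print("expected appearances per number:", expected)
print("most frequent number and count:", most[0], most[1])
print("least frequent number and count:", least, counts[least])
```

    Students can rerun the simulation with different seeds and see that the "hot" numbers change from run to run, which bears on whether past frequencies can predict future drawings.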

    Choosing "Lotto of Drawing History" brings up the history tables. This page lists the drawing date (from most recent to least recent), the winning "Pick 6" numbers, the number of tickets matching six out of six, five out of six, etc., and the prize dollar amount.

    This raw data can be used in various ways. The casual viewer may simply scan the information or choose to jot down previous winning ticket numbers. The statistics student, however, can copy this information onto the computer's clipboard and paste it into an application program such as a word processing document, spreadsheet, or datafile. (Copying the data from the html source file -- "document source" in Netscape -- may help to retain some of the table formatting.) After deleting unneeded information and possibly applying some formatting changes, the data is ready to analyze. Students can prepare research questions and predict results. Questions may include: What is the most frequently occurring number or numbers? What are the most frequently occurring numbers for winning tickets? What is the correlation between the number of winning numbers matched and the prize amount?
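    As a minimal sketch of the first research question, the pasted history can be tallied into a frequency table. The row layout (a date followed by six numbers) is an assumption about how the copied text might look, and the values shown are made-up placeholders, not actual Texas Lotto results.

```python
from collections import Counter

# Hypothetical pasted rows: a date followed by six winning numbers.
# These are made-up placeholder values, not actual Texas Lotto results.
pasted_text = """
01/04/97  3 11 19 27 35 42
01/08/97  5 11 20 27 38 49
01/11/97  3 14 19 30 35 50
"""


def frequency_table(text):
    """Tally how often each winning number appears in the pasted rows."""
    counts = Counter()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 7:  # skip blank lines and rows without 7 fields
            continue
        counts.update(int(field) for field in fields[1:])
    return counts


table = frequency_table(pasted_text)
for number, count in sorted(table.items()):
    print(f"{number:2d}: {'*' * count}")  # simple text histogram
```

    The same table could be built in a spreadsheet; the point is that the frequency tables the webmaster dropped can be rebuilt from the history tables alone.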

    Actual Data and Data-Filled Documents:

  • U.S. Census Bureau - http://www.census.gov

    This Web page contains buttons which will link the user to a number of informative pages regarding census information. Selecting the button labeled "Current U.S. Population Count" reveals a page that actually provides two options: "United States" and "World". Selecting "United States" connects to a page showing that the resident population projected to the date and time this author visited the page was 266,620,189. The page also provides the component settings that generated this number, including birth, death, and migration rates. Another hypertext link reveals the documentation for these projections, if the user wishes to read about them. Selecting "World" yields the current world population, the monthly world population figures for the last year, and links to the related tables: "World Population: 1950 to 2050", "Historical Estimates of World Population", and "World Vital Events Per Time Unit: 1997". Student questions that these pages generate include: How were these projections calculated? Knowing only the monthly world population figures for the past year, how could one predict the population for February 2001, for example? The exercises could lead to discussions of lines and functions of best fit. Using these functions and lines, past year estimates can be compared to actual figures.
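    The line-of-best-fit exercise can be sketched with ordinary least squares. The monthly figures below are hypothetical placeholders chosen for illustration, not Census Bureau data, and the function name is an assumption of this sketch.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical month index (0 = first month) and population in millions.
# Placeholder values for illustration only, not Census Bureau figures.
months = [0, 1, 2, 3, 4, 5]
population = [5800.0, 5806.5, 5813.0, 5819.5, 5826.0, 5832.5]

intercept, slope = fit_line(months, population)
months_ahead = 48  # how far past the first month to extrapolate
prediction = intercept + slope * months_ahead

print(f"growth per month: {slope:.1f} million")
print(f"projected population at month {months_ahead}: {prediction:.1f} million")
```

    A straight line is only one candidate model. Students could fit other functions to the same figures and compare how well each one reproduces past estimates.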

    The World Population page will link the user to the "International Database". Once reaching the IDB, the user is given a choice of ways to access the data. "Display" will allow the user to look at the data on the screen or print it, "Spreadsheet" will allow the user to load the data into a spreadsheet, and "User Configurable" allows the user to control many aspects of the appearance of the output. After choosing the output format, the user is prompted to select one of a number of tables such as "Infant Mortality Rates and Life Expectancy at Birth, by Sex" or "Urban Population as a Percent of Total Population". The user then selects one or more countries and one or more years. Once the parameters are submitted, the resulting table appears. Again, the table may be printed or copied into another format, depending on what was selected in the previous options.

    If the results of this search lead to more questions than answers, the user may wish to return to the home page, where a hypertext link entitled "Other Official Statistics" leads to "International Statistical Agencies". This page contains links to the national statistical bureaus of forty-two countries and four international organizations.

    Back at the home page, a user can be linked to "CenStats/CenStore". CenStats is a database containing more than 1,000 Census Bureau publications with statistical information on topics such as population, housing, state and local governments, and education. The document 1996 Statistical Abstract of the United States, for example, is available in this database as of November 27, 1996. Section four of this document is "Education". If a user wishes to view this document, he/she reads it with Adobe's Acrobat Reader, which is free and can be downloaded by following on-screen prompts. CenStore contains descriptions and ordering information for Census Bureau products such as CD-ROMs, maps, etc.

    There are other sites besides the census site for finding government-related data. However, some are restricted and/or require a subscription. For instance, STAT-USA, located at http://www.stat-usa.gov, contains the National Trade Data Bank, the Economic Bulletin Board, the Global Business Procurement Opportunities, and the Bureau of Economic Analysis Economic Information. However, accessing information from these databases requires a subscription, which can be obtained from the home page. Students learning how to obtain data via the Internet need to be aware of this potential limitation to their searches.

    Search Engines Yielding Selected Populations:

  • CollegeNET - http://www.collegenet.com/lists/list.html

    This Web page is designed to assist potential college students in finding a college that meets their needs. However, it may also assist master's and doctoral students in finding an appropriate sample from a college that contains a preferred population.

    The home page of this source has an option entitled, "College Search". Selecting this, the user finds a menu of choices: "Four Year U.S. College", "Four year schools by state map", "Community, Technical, and Junior Colleges", "Schools in Canada", and "Schools in New Zealand". Once choosing "Four Year U.S. Colleges", the user finds a search engine where the user sets parameters such as state/region, enrollment, tuition, sports, and majors offered. The search will return a list of schools that fit those categories chosen as well as hypertext links to the schools.
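    The idea behind such a parameter search can be sketched in a few lines: every record is checked against each parameter the user supplied, and only the records that pass all of them are returned. The record fields, the function name, and the four institutions below are made-up placeholders for illustration and are not taken from CollegeNET.

```python
from dataclasses import dataclass


@dataclass
class College:
    name: str
    state: str
    enrollment: int
    tuition: int  # annual tuition in dollars


# Made-up placeholder records, not taken from CollegeNET.
colleges = [
    College("Example State University", "OH", 24000, 4000),
    College("Sample College", "OH", 1800, 15000),
    College("Placeholder Tech", "PA", 9000, 12000),
    College("Demo University", "OH", 21000, 4500),
]


def search(records, state=None, min_enrollment=0, max_tuition=None):
    """Return the records that satisfy every parameter that was supplied."""
    matches = []
    for college in records:
        if state is not None and college.state != state:
            continue
        if college.enrollment < min_enrollment:
            continue
        if max_tuition is not None and college.tuition > max_tuition:
            continue
        matches.append(college)
    return matches


comparable = search(colleges, state="OH", min_enrollment=15000, max_tuition=6000)
for college in comparable:
    print(college.name, college.enrollment)
```

    A student looking for institutions comparable to his/her own would set the parameters to match the home institution and treat the returned list as candidates for a comparison sample.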

    Back at the home page, the user may select the option "lists". This will result in a page containing the following categories: "Research One Universities", "Catholic Colleges and Universities", "Schools with 1994 Rhodes Scholar Recipients", "Ivy League Schools", "Women's Schools", and "Historically Black Colleges and Universities".

    Although these pages were probably designed primarily for high school seniors, potential graduate students, or guidance counselors to assist them in locating appropriate institutions to attend, this type of data may also be used by statistics students and students completing research in order to discuss and test for representativeness of data. Once data is obtained from students in one's own educational institution, many students have difficulty in seeing how this data may be generalized. By determining in what categories one's own educational institution falls and determining other institutions that may be comparable, this task may be easier.

    Other classroom questions that this page may generate include: Once the students make a hypothesis regarding any of the above or any similar questions, the instructor should invite the students to then determine a strategy for finding an answer and define the type of analysis to be used. Furthermore, the students should speculate on the question, "If significance is found, what can the researcher conclude?" These activities ensure that the student sees the practical application of the concepts discussed in the text, acquires data that makes sense to the student, and possibly finds some enjoyment in completing the tasks.

    A Web page such as CollegeNET can also serve students who need to find a comparative sample from another institution in order to get a more representative sample or to add the comparison as a hypothesis in their theses. Search tools such as College Search can assist in this manner. Other search engines most likely exist to assist a user in finding businesses, agencies, and other societal institutions by category. The Web can be a quick and inexpensive tool in this endeavor. The Yahoo directory (http://www.yahoo.com) may be the best directory for finding similar types of search engines and sub-directories. Two of the best print-source Internet directories are The Internet Yellow Pages and The Educator's Internet Yellow Pages, which are cited in the bibliography of this paper.

    Survey Construction:

  • The Louis Harris Poll - http://www.techsetter.com/harris/html/home.html

    This Web page contains the current Harris Poll, which is identical to the Harris nationwide telephone survey. Its purpose is to see if the political climate of Internet users differs from that of the country as a whole.

    The home page gives the user the option to "take the most recent survey", to "look at the results to date", or to "look at the previous survey results". At the time this paper was written, the previous survey results had questions relating to the November presidential elections. The header states that the phone poll was based on interviews with 1,005 adults and was taken between July 9 and 21. It goes further to state that the Web poll results are based on 472 participants during the month of August. The survey contains twenty-three questions, the answers from which the participants selected, a column showing the telephone results in bar graph form and as percentages, and a similar column with the Web results. An example of one of the questions and its results follows:

    17. What about adoption of children by two women who live together as a couple, whether they are married or not. Do you:

    Harris Poll - Survey Example

                                 Telephone    World Wide Web
    Approve                         16%            35%
    Disapprove                      61%            49%
    Don't Feel Strongly About       23%            14%

    This type of page provides a tool for discussing with students the construction of well-worded survey items. Data collection methodology may also be discussed: Which results are more representative of the population and why? Did the researchers take precautions to control variables that may have influenced the results? In addition, the results lead to an assortment of statistical questions. Do cyber participants differ significantly in political opinion from the society as a whole? Are they more conservative? Less? How would one find this answer from the data given?
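    One possible way to approach the last question is a chi-square test comparing the telephone and Web response distributions for question 17. This is a sketch of one technique the class might choose, not a method prescribed by the poll. It uses the percentages and sample sizes reported above; the counts are reconstructed by rounding and are therefore approximate, and the function name is hypothetical.

```python
# Percentages for question 17 and sample sizes as reported on the page.
# Counts are reconstructed by rounding, so they are approximate; the Web
# percentages as reported sum to 98%, so its counts do not total 472.
sample_sizes = {"telephone": 1005, "web": 472}
percentages = {
    "telephone": [16, 61, 23],  # approve, disapprove, don't feel strongly
    "web": [35, 49, 14],
}

observed = {
    group: [round(sample_sizes[group] * pct / 100) for pct in pcts]
    for group, pcts in percentages.items()
}


def chi_square(table):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(rows, row_totals):
        for count, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df


statistic, df = chi_square(observed)
CRITICAL_VALUE = 5.991  # chi-square critical value for df = 2, alpha = 0.05

print(f"chi-square = {statistic:.1f} with {df} degrees of freedom")
print("significant at the 0.05 level:", statistic > CRITICAL_VALUE)
```

    The test treats both groups as independent random samples. Because Web participants chose to take the survey themselves, that assumption is open to question, which returns the class to the discussion of representativeness.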

  • The Wilmington Institute: Trial and Settlement Sciences - http://www.wilmington-institute.com/

    This Web page is the home page of an institute established to help those in the legal profession to forecast the probable outcome of their trials.

    This page also contains a number of surveys under the heading "JuryTalk". At the time this paper was written, the menu of surveys included tobacco company lawsuits, the O.J. Simpson trial, and the McVeigh trial. In contrast to the Harris Poll, the McVeigh survey contains only two questions: (1) "Do you believe Timothy McVeigh was involved in the planning and/or execution of the Oklahoma City federal building bombing?" and (2) "If you answered yes to (1), do you believe Timothy McVeigh was part of a well organized and financed, geographically dispersed anti-government conspiracy?" The survey then asks the participant to identify his/her age group, gender, ethnicity, and area of residence from a list of possibilities. Once the participant submits the results, updated overall results in terms of percentages appear on the screen.

    Although similar to the Harris page, this page offers some variation in statistical application, such as the use of an analysis of variance (ANOVA) to test for differences between ethnic groups. Another question would be whether the demographic distributions shown on this page represent the population as a whole or even the population of computer users. The page also provides students with an authentic example of the use of statistics in the workplace.
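    A one-way analysis of variance across groups can be sketched as follows. Coding a "yes" response as 1 and a "no" as 0 is an assumption of this sketch, and the three groups and their responses are made-up placeholders, not results from the JuryTalk survey. For yes/no responses, a chi-square test is a common alternative that the class might also consider.

```python
def one_way_anova(groups):
    """Return the F statistic and its degrees of freedom (between, within)."""
    all_values = [value for group in groups.values() for value in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_between += len(values) * (mean - grand_mean) ** 2
        ss_within += sum((value - mean) ** 2 for value in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical responses coded 1 = yes, 0 = no for three placeholder groups.
# Made-up values for illustration, not results from the JuryTalk survey.
responses = {
    "group_a": [1, 1, 1, 0],
    "group_b": [1, 0, 0, 0],
    "group_c": [1, 1, 0, 0],
}

f_statistic, df_between, df_within = one_way_anova(responses)
print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```

    Students would compare the resulting F value with the critical value from an F table for the same degrees of freedom before drawing any conclusion about group differences.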

    II. Concerns and Needs

    An instructor who chooses to use the Web in the classroom must be aware of some potential pitfalls of teaching with technology and working in the chaotic world of the Web:

  • An instructor must realize that each Web site can serve only a limited number of visitors at one time. Once the site has reached its maximum, anyone else who tries to reach it will be refused. This can be frustrating when a good portion of the class plan is centered around a particular site that cannot be reached. Instructors need to keep in mind that many students in a lab setting all trying to access the same site at once can also cause some users to be refused.

  • Instructors need to remember that while some sites are being updated and improved, others are being removed or left with out-of-date information. An instructor should always attempt to visit a site within a few days before a lesson in which it is being discussed. If a familiar site disappears, a good search engine for finding a replacement is AltaVista (http://altavista.digital.com), which allows the user to customize the search, and a good directory is Yahoo (http://www.yahoo.com), which groups thousands of sites under many categories. Again, The Internet Yellow Pages and The Educator's Internet Yellow Pages are good print sources.

  • Both students and instructors need to realize that "having access to databases" via the Internet can have various interpretations, as has been shown in the examples above. Many Web-based databases provide only a search engine interface in which the user enters the parameters of his/her search. The actual database, however, is housed on the server along with the Web site files. The author of the Web site has access to the entire database, but a visitor receives only those records that fit the queries of the search. On other pages, such as the Texas Lotto site, pages of raw data are available at one time. The larger the database, though, the less likely the user is to find this type of output. If access to an entire, very large database is necessary, an email message to the Webmaster of the related site may result in some assistance.

    In any case, few would dispute that the advantages of Web-based databases -- availability, ease of use, and currency of information -- outweigh many of the inconveniences associated with their use. Another advantage is their unintimidating yet practical appeal to students. Certainly, once an instructor introduces a few sites such as the ones above to his/her students, the students will return the gift one hundredfold in a few short months.

    Bibliography

    Cyber - Government Resources:

    U.S. Census Bureau
    http://www.census.gov
    Census reports and links to other federal government and international agencies offering statistical reports.

    Fedworld
    http://www.fedworld.gov
    Central location and starting point for finding U.S. government information.

    Government Statistics on the Internet (paper)
    http://www.stats.gov.nt.ca/Bureau/General/WWWPaper.html
    Survey of government statistics (Canada, U.S., U.K.) available on the Internet.

    SEC (Securities and Exchange Commission)
    http://www.sec.gov
    U.S. government site includes filings by public companies.

    Stat-USA
    http://www.stat-usa.gov
    Department of Commerce service offering detailed government statistics-based reports.

    Cyber - Other Resources:

    Facts on File
    http://www.facts.com
    Producer of comprehensive studies of modern issues. Reports include some survey results with statistics.

    The Gallup Organization
    http://www.gallup.com
    Provider of public opinion poll data.

    The Harris Poll
    http://techsetter.com/harris/html/home.html
    Contains the latest Harris poll and comparisons of the previous poll's telephone responses with Internet responses.

    CollegeNET
    http://www.collegenet.com/
    A directory of colleges and universities divided into various categories and search parameters.

    Texas Lotto
    http://crashdummy.iglobal.net/lotto
    The results of the latest Texas Lotto drawing and the results of the drawing over several years.

    Non-Cyber

    Braun, E. (1994). The Internet Directory. New York: Ballantine Books.

    Ellsworth, J. H. (1994). Education on the Internet: A Hands-On Book of Ideas, Resources, Projects, and Advice. Indianapolis, IN: Sams Publishing.

    Hahn, H. and Stout, R. (1994). The Internet Yellow Pages. New York: Osborne McGraw-Hill.

    Place, R., Dimmler, K., and Powell, T. (1996). Educator's Internet Yellow Pages. Englewood Cliffs, N.J.: Prentice-Hall.
    Copyright 1997, Lynne Pachnowski, Isadore Newman, and Joe Jurczyk.

    Appendix - Followup

    March 2, 1997

    Based on the presentation at the EERA conference, here are some additional links that may be of interest:

  • First, here are some specific search engines that were discussed: Lycos and Excite.

  • Here is a site that has compiled numerous search tools, including WWW search engines and newsgroup search tools.

  • These are a couple of lists of meta-search engines, that is, sites that search multiple search engines at once, compile the results, and remove duplicates. (Meta-Crawler is one that I've worked with on occasion.) The results may not be as specific as you would like, but the meta-search engines can often provide more "hits" because they are working with a larger base of indexed pages than individual search engines.

  • If you have a web site that you'd like to register with various search engines, go to the Submit It! site. (You'll most likely want to use the free service, Submit It! Free. The other services register sites with additional search engines, but the free service covers all of the popular ones.)

  • The prices for access to Stat-USA are detailed here for individuals ($50/quarter or $150/year) and institutions (varies, based on size).

  • The Internet Yellow Pages, which we discussed in the presentation, must have gotten so large that the publisher (Osborne) decided to break the book up into several smaller, more specialized offerings.

  • The only place that I was able to locate to order The Educator's Internet Yellow Pages was from Quantum Books.

  • During the presentation, Donna Graham-Harris mentioned an education mailing list, EDINFO, sponsored by the Department of Education. Here is the information (thanks, Donna):

    ========================================================
    To subscribe to (or unsubscribe from) EDInfo, address an
    email message to: listproc@inet.ed.gov Then write
    either SUBSCRIBE EDINFO YOURFIRSTNAME YOURLASTNAME in the
    message, or write UNSUBSCRIBE EDINFO (if you have a
    signature block, please turn it off). Then send it!
    ========================================================

    If you have any questions about the tools we discussed during the conference, feel free to send me e-mail. ...Joe

    Last Revised on 3/27/97.
    For comments or questions about this page, please contact Joe Jurczyk.