Web Indicator Reports

Here are the steps necessary to collect web data and calculate a range of indicators for a collection of publications, including the Mean Normalised Log-transformed Citation Score (MNLCS) and the Normalised Proportion Cited (NPC).

  1. Step 1: Identify the group of publications to be assessed and categorise them by field (e.g., using Scopus or WoS subject categories).
  2. Step 2: Save the article information (authors, title, journal, publication year) in a standard tab-delimited format, with a separate file for each subject category/year combination. First, discard publications that are in small subject/year combinations (e.g., <100 publications). Then create one tab-delimited file for each remaining subject/year combination, with one line per publication. Each line should contain the author names in a standard format (ideally following the Scopus or Web of Science format), the publication year, the article title and the journal name (ignore the journal name for books). The first line of the file should contain header information. Here is an example of the format for journal articles and for books. If your data is in a spreadsheet, it can be saved in this format using the Save As command and selecting the Plain text (tab delimited) format. If the records are in Scopus or the Web of Science, choose the tab-delimited format when saving them. The filename for each file must contain the subject name and year, and end with -[group].txt, where [group] is replaced by a name for the collection of articles. Use the same [group] name for all files containing publications from the same group.
  3. Step 3: For each retained subject/year combination, a benchmarking sample of articles from the rest of the world is needed. For this world reference set, download all articles in the Scopus/WoS field/year if possible, or otherwise a large balanced sample (e.g., the first and last 5000 articles published in the category). Filter out any large trade or art journals with a high proportion of uncited articles. Name the files using the standard Webometric Analyst naming convention, so that each filename contains the subject name and year, and ends with -world.txt. These filenames must exactly match the group filenames, except for replacing -[group].txt with -world.txt. All of the files should be stored within a single folder that does not contain any other files.
    1. Here is a small artificial example of a complete set of publication data files in structured name format, with all publications in a single file being from the same field and year, and each group file corresponding to a world file.
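
The naming rule in Steps 2 and 3 can be checked before running any searches. The following is a minimal Python sketch, not part of Webometric Analyst: the function name check_pairs, the group name mygroup and the example filenames are all hypothetical. It only compares filename endings and does not look inside the files.

```python
def check_pairs(names, group):
    """Compare group and world filenames, which should differ only in their ending."""
    group_end = f"-{group}.txt"   # e.g. -mygroup.txt (hypothetical group name)
    world_end = "-world.txt"
    # Strip the ending to get the shared "subject name and year" part.
    group_stems = {n[: -len(group_end)] for n in names if n.endswith(group_end)}
    world_stems = {n[: -len(world_end)] for n in names if n.endswith(world_end)}
    return {
        "group_without_world": sorted(group_stems - world_stems),
        "world_without_group": sorted(world_stems - group_stems),
        # The folder should not contain any other files.
        "unexpected": sorted(
            n for n in names if not n.endswith((group_end, world_end))
        ),
    }


# Hypothetical folder listing; in practice the names could come from os.listdir(folder).
example_names = [
    "Oncology 2015-mygroup.txt",
    "Oncology 2015-world.txt",
    "Virology 2015-mygroup.txt",
    "notes.docx",
]
print(check_pairs(example_names, "mygroup"))
```

In this example the Virology 2015 group file has no matching world file and notes.docx does not belong in the folder, so both would need fixing before continuing.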

  1. Step 5: Bing API searches must be paid for after the first 1,000 free searches. Unless you have a budget for this, the next stage is to generate a random sample of articles from each world and group set (e.g., 500 per set) and use these samples instead of the full sets. For this, from the Make Searches menu, select the Replace search files with a random sample up to a maximum number menu option and instruct Webometric Analyst to replace all the search files with random samples of 500.
  2. Step 6: Use Webometric Analyst to run all the searches. For this, start Webometric Analyst, enter your search key, close the Startup Wizard, click the Run All Searches In File button and select one of the search files. Wait for Webometric Analyst to finish and then click the same button again and select another file. Repeat this until all the files have been run. The picture below shows some of the files generated for PowerPoint searches, together with two additional files created in Step 7. Example files for Wikipedia searches.
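
The sampling idea in Step 5 (keep at most a fixed number of randomly chosen searches per file) can be sketched in Python as follows. This is an illustration only, not how Webometric Analyst implements its menu option. The function name sample_lines is hypothetical, and the sketch assumes a search file holds one search per line with no header line.

```python
import random


def sample_lines(lines, max_size, seed=None):
    """Return at most max_size lines chosen at random, kept in their original order."""
    if len(lines) <= max_size:
        return list(lines)  # small files are kept whole
    rng = random.Random(seed)  # a fixed seed makes the sample repeatable
    chosen = sorted(rng.sample(range(len(lines)), max_size))
    return [lines[i] for i in chosen]


# Hypothetical search file contents: 2,000 searches reduced to 500.
searches = [f"search {i}" for i in range(2000)]
sample = sample_lines(searches, 500, seed=1)
print(len(sample))
```

Sampling without replacement and applying the same maximum to both the group and world sets keeps the two sets comparable while limiting the number of paid searches.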

  1. Step 7: Use Webometric Analyst to calculate MNLCS, gMNCS, EMNPC (NPC) and MNPC, and their confidence limits. For this, start Webometric Analyst, close the Startup Wizard and then select Calculate MNLCS, gMNCS and NPC for a set of web searches (structured file names) from the Reports menu. When requested, select the folder containing all of the files. This will create two new files. The file called all_data.txt contains all of the data extracted from the searches in a format that can be loaded into a stats package or spreadsheet. This is a backup file in case you want to calculate your own indicators. The file called report.txt contains MNLCS, gMNCS and NPC values for each individual file in a long list at the top. Near the end of the file it then reports tables of the combined MNLCS, gMNCS, EMNPC (NPC) and MNPC values for the whole collection. This is the main part of the results. Note that EMNPC is the new name for NPC.
  2. Step 8: If you want MNLCS, EMNPC (NPC) and MNPC calculated separately for each year, then create new folders, one for each year, and copy all the files from each year into the relevant year folder. Then repeat Step 7 for each year folder.
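
If you want to calculate your own indicators from the backup data, the two main indicators can be sketched in Python for a single subject/year combination. This sketch assumes the usual definitions: MNLCS is the mean of ln(1 + count) for the group divided by the mean of ln(1 + count) for the world set, and NPC is the proportion of group articles with a non-zero count divided by the same proportion for the world set. The function names and the example counts are hypothetical, the layout of all_data.txt is not assumed, and confidence limits and the combination across several fields and years are not covered, so the values may differ from those in report.txt.

```python
import math


def mnlcs(group_counts, world_counts):
    """Mean normalised log-transformed score for one subject/year combination."""
    # World average of ln(1 + count); assumes at least one non-zero world count.
    world_mean = sum(math.log(1 + c) for c in world_counts) / len(world_counts)
    # Normalise each group value by the world average, then take the mean.
    return sum(math.log(1 + c) / world_mean for c in group_counts) / len(group_counts)


def npc(group_counts, world_counts):
    """Proportion of group articles cited, relative to the world proportion."""
    group_prop = sum(c > 0 for c in group_counts) / len(group_counts)
    world_prop = sum(c > 0 for c in world_counts) / len(world_counts)
    return group_prop / world_prop


# Hypothetical counts (e.g. web mentions per article) for one subject/year.
group = [0, 1, 3]
world = [0, 0, 1, 1]
print(mnlcs(group, world), npc(group, world))
```

Values above 1 indicate that the group's articles score above the world average for that subject and year, and values below 1 indicate the opposite.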