Wikidata:SourceMD/instructions
Typical use
Overview
- Get 1 or more identifiers for a publication: a DOI, PMID, or PMCID.
- Go to tools.wmflabs.org/sourcemd/ (fully automated batch mode) or to tools.wmflabs.org/sourcemd/index_old.php (semiautomated mode). These instructions describe use of the semiautomated mode.
- Enter the identifiers, one per line (up to a few dozen should work fine; hundreds or more will likely break the tool).
- Run
- Check the output in SourceMD if you wish, then proceed to QuickStatements.
- Run in Wikidata:QuickStatements.
- Done!
- Check output in various Wikidata records if you wish.
- If you identify any problems, fix them with further Wikidata edits.
Collect media identifiers
SourceMD accepts input in these forms: DOI, PMID, or PMCID.
Traditional citation styles based on paper publishing may not list these identifiers. Citation systems for digital publishing may show them. Often the publication itself will list the media identifier.
- Public Library of Science (Q233358) features the DOI on its articles
- PubMed (Q180686) shows the PMID in its item profiles
Put the identifiers into the input box
List one identifier per line. Use only one identifier per publication; do not use both DOI and PMCID for the same item. When in doubt, use the PubMed Central ID (PMCID).
You can list multiple identifiers on multiple lines; this will create multiple items in a batch.
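A list of identifiers can be tidied before it is pasted into the input box. The following is a minimal sketch, not part of SourceMD: the helper names `classify` and `prepare_input` are hypothetical, the regular expressions are assumptions about the usual shape of each identifier, and the sample values are placeholders.

```python
import re
from typing import List

# Assumed shapes of each identifier type; these patterns are illustrative
# and are not taken from SourceMD itself.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
PMCID_RE = re.compile(r"^PMC\d+$", re.IGNORECASE)
PMID_RE = re.compile(r"^\d+$")


def classify(identifier: str) -> str:
    """Return 'doi', 'pmcid', 'pmid', or 'unknown' for one identifier."""
    value = identifier.strip()
    if DOI_RE.match(value):
        return "doi"
    if PMCID_RE.match(value):
        return "pmcid"
    if PMID_RE.match(value):
        return "pmid"
    return "unknown"


def prepare_input(raw_identifiers: List[str]) -> str:
    """Drop blanks, unrecognized values and repeats; return one identifier per line."""
    seen = set()
    lines = []
    for raw in raw_identifiers:
        value = raw.strip()
        if not value or classify(value) == "unknown":
            continue
        key = value.lower()  # treat differently-cased repeats as the same identifier
        if key not in seen:
            seen.add(key)
            lines.append(value)
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(prepare_input([" 12345678 ", "", "12345678", "PMC1234567", "not an id"]))
```

This only removes exact repeats. It cannot tell that a DOI and a PMCID refer to the same publication, so that check still has to be done by hand.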
SourceMD stages information for review
SourceMD collects information from off-wiki databases and formats it for inclusion into Wikidata.
The user can edit the text which SourceMD presents. Typically there is no reason to change anything.
SourceMD retrieves different information depending on which identifier you supply. When an academic paper is published, the following sequence of events occurs:
- Publishers report to Crossref that they have media to publish
- Crossref assigns a DOI to the media and registers it in its database
- About one day later, PubMed checks to see if the media is in their index of medical publications. If it is, then they copy the Crossref data, get the DOI, and assign a PMID to the work.
- About a week later, PubMed Central shares a free-to-read copy of the publication only if they have an agreement to publish it. If they publish, then they take the DOI and the PMID, and also they assign a PMCID.
For Wikidata, this means you should supply the PMCID whenever possible. From a PMCID, Wikidata gets the PMCID, the PMID, and the DOI. From a PMID, it gets the PMID and the DOI. From a DOI, it gets only the DOI.
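The preference described above (PMCID first, then PMID, then DOI) can be expressed as a small selection rule. This is a sketch only: `best_identifier` and the dictionary keys are hypothetical names chosen for illustration, and the sample values are placeholders.

```python
from typing import Dict, Optional

# Preference order taken from the text above: the PMCID yields the most data,
# the DOI the least.
PREFERENCE = ("pmcid", "pmid", "doi")


def best_identifier(known: Dict[str, str]) -> Optional[str]:
    """Return the single identifier to submit for one publication, or None."""
    for kind in PREFERENCE:
        value = known.get(kind, "").strip()
        if value:
            return value
    return None


if __name__ == "__main__":
    # Placeholder identifiers for one publication that has no PMCID.
    print(best_identifier({"doi": "10.1234/example.5678", "pmid": "12345678"}))
```

Submitting only the chosen identifier also follows the rule of using one identifier per publication.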
Transfer data from SourceMD to QuickStatements
- Press the button "Open in QuickStatements"
- All of the statements previewed in SourceMD will be transferred to the QuickStatements interface
Run QuickStatements
- If you have not previously authorized QuickStatements, you will be prompted to allow it to make changes on behalf of your account.
- Press the "Run" button.
- When it has finished, you will be able to view the item(s) you created.
- Authorize QuickStatements to make changes on behalf of your account.
- Run QuickStatements (click "Run" button).
- QuickStatements has successfully finished. You can click on the label of the new item to view it.
Consider output of QuickStatements
After the item has been processed by QuickStatements, you will be taken to a new screen showing the newly created item(s).
- If you ran QuickStatements for a single item, you can view the item from the Done screen.
- If you ran QuickStatements as a batch, no new screen appears when it has finished, but you can open the batch log report and select the items that were edited in that batch. From there, click on the item title or Q number to review the output of QuickStatements.
- QuickStatements has successfully added the item.
- Batch log report showing successfully added item run in batch mode.
Special cases
Merge records
Identify multiple Wikidata items for one publication
Wikidata may mistakenly contain more than one item for the same publication. Correct this error by merging the items.
With SourceMD, this can happen when one person processes one identifier for a publication, such as its DOI, and another person later processes a different identifier for the same publication, such as its PMID. The tool may then create a separate item for each.
- Q55998133 is the item for a paper
- Q55998226 was for the same paper before being corrected with a merge
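One way to look for such duplicates is to ask the Wikidata Query Service whether any item already carries one of the publication's identifiers. The sketch below only builds the SPARQL text, which you would then paste into the query service yourself. It rests on several assumptions that these instructions do not state and that should be checked against the property pages: that P356 is the DOI property, P698 the PubMed ID property, and P932 the PMCID property; that DOIs are stored in upper case; and that PMCIDs are stored without the "PMC" prefix. The function name `duplicate_check_query` is hypothetical.

```python
from typing import Optional


def _literal(value: str) -> str:
    """Quote a value as a SPARQL string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def duplicate_check_query(doi: Optional[str] = None,
                          pmid: Optional[str] = None,
                          pmcid: Optional[str] = None) -> str:
    """Build a SPARQL query listing items that carry any of the given identifiers."""
    clauses = []
    if doi:
        # Assumption: Wikidata stores DOIs in upper case under P356.
        clauses.append("{ ?item wdt:P356 %s }" % _literal(doi.strip().upper()))
    if pmid:
        # Assumption: P698 holds the PMID.
        clauses.append("{ ?item wdt:P698 %s }" % _literal(pmid.strip()))
    if pmcid:
        # Assumption: P932 holds the PMCID without its "PMC" prefix.
        digits = pmcid.strip().upper()
        if digits.startswith("PMC"):
            digits = digits[3:]
        clauses.append("{ ?item wdt:P932 %s }" % _literal(digits))
    if not clauses:
        raise ValueError("at least one identifier is required")
    body = "\n  UNION\n  ".join(clauses)
    return "SELECT DISTINCT ?item WHERE {\n  " + body + "\n}"


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(duplicate_check_query(doi="10.1234/example.5678", pmcid="PMC1234567"))
```

If the query returns more than one item, those items are candidates for a merge.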
Use the merge function
Please see Help:Merge for a detailed discussion of how to perform merges.
The best way is to use the Merge.js gadget to perform merges. Add the gadget to your account as described on the help page, then use the drop-down "More" menu at the top right of any page to access the merge action.
Verify the merge
View the edit logs from both pages to verify that the merge was successful. Make sure the redirect works.
Changing the SourceMD formatting
You can edit the output that SourceMD generated from the DOI.
- Properties can be added. Terms must be separated by tabs. LAST always adds the statement to the newly-created item.
- If an author of the article has a Q number, you can at this point add a P50 (author) statement with the author's Q number as the value, because the scraped value is always P2093 (author name string). Be sure to keep the correct P1545 (series ordinal) for that author.
- Deprecated properties can be removed.
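An added author line can be sketched as follows. The helper `author_row` is hypothetical, and `Q12345` is a placeholder Q number. The row layout assumes the tab-separated QuickStatements command format, in which a qualifier follows the main statement as a further property and value pair, and a string value such as the series ordinal is wrapped in double quotes.

```python
def author_row(author_qid: str, ordinal: int, subject: str = "LAST") -> str:
    """Build one tab-separated row: P50 (author) with a P1545 (series ordinal) qualifier."""
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError("author_qid must look like 'Q123'")
    if ordinal < 1:
        raise ValueError("ordinal must be 1 or greater")
    # Assumption: the series ordinal is a string value, so it is double-quoted.
    return "\t".join([subject, "P50", author_qid, "P1545", '"%d"' % ordinal])


if __name__ == "__main__":
    # Q12345 is a placeholder; 3 means the third author listed on the paper.
    print(author_row("Q12345", 3))
```

The ordinal passed in should match the one on the scraped P2093 line for the same author, so that the author order of the paper is preserved.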
Property applied in error
If there is a statement that was applied in error or is deprecated, you can remove it.
- Click on the edit pencil in the right-hand corner.
- Select "Remove".
In the example shown, P364 (original language of work) is deprecated and should be replaced with P407 (language of work or name).
Duplicated field
A statement may have accidentally been duplicated.
- Click on the edit pencil for the superfluous statement.
- Select "Remove".
Other