Wikidata:Requests for permissions/Bot/Dexbot 6
- The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
- Approved; RfA was also approved. The Anonymouse [talk] 13:59, 19 May 2014 (UTC)[reply]
Dexbot
Operator: Ladsgroup
Task/s: Merging empty items with related non-empty items
Code: User:Ladsgroup/del.py
Function details: The bot goes through items without sitelinks and checks their history pages. If the history shows exactly one sitelink (since removed), the bot checks whether that sitelink now exists in another item and, if so, merges the two. In the first phase I'll merge only pairs in which the empty item has no statements at all.
How does the bot merge? Here is an example: check out these edits [1] (the bot merges into the lower Q-id) [2], and it edits the backlinking items: Special:WhatLinksHere/Q4051236 --Amir (talk) 09:36, 15 November 2013 (UTC)[reply]
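The selection step described in the function details can be sketched roughly as follows. This is not the code from User:Ladsgroup/del.py; it is a minimal, self-contained illustration that assumes the item history and a sitelink lookup are already available in memory. The `Item` class, the `owner_of` mapping, `find_merge_target`, and the Q-ids in the usage lines are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

SiteLink = Tuple[str, str]  # (dbname, page title)


@dataclass
class Item:
    """Hypothetical in-memory stand-in for a Wikidata item (not a pywikibot class)."""
    qid: str
    sitelinks: Dict[str, str] = field(default_factory=dict)
    statements: List[str] = field(default_factory=list)
    # Sitelinks present in each past revision, oldest first.
    history: List[Dict[str, str]] = field(default_factory=list)


def find_merge_target(empty: Item, owner_of: Dict[SiteLink, str]) -> Optional[str]:
    """Return the Q-id now holding the item's only former sitelink, or None to skip."""
    # First phase: only items with no current sitelinks and no statements qualify.
    if empty.sitelinks or empty.statements:
        return None
    # Collect every sitelink the item ever had across its history.
    seen: Set[SiteLink] = set()
    for revision in empty.history:
        seen.update(revision.items())
    # Only handle the simple case of exactly one (since removed) sitelink.
    if len(seen) != 1:
        return None
    (link,) = seen
    target = owner_of.get(link)
    if target is None or target == empty.qid:
        return None
    return target


# Made-up Q-ids and title, for illustration only.
orphan = Item("Q200", history=[{"enwiki": "Example"}, {}])
owner_of = {("enwiki", "Example"): "Q100"}
print(find_merge_target(orphan, owner_of))  # Q100
```

Returning None for anything other than the single-sitelink case keeps ambiguous items out of the automatic run, so they can be handled by hand.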
- Question Does the bot automatically request the other item for deletion? The Anonymouse (talk) 17:58, 15 November 2013 (UTC)[reply]
- The bot deletes under my username (Special:Log/delete), and because it's a slow and steady process, it won't flood RC. Amir (talk) 05:07, 16 November 2013 (UTC)[reply]
- It deletes them completely automatically, right? If so, it seems like an "adminbot", which has been controversial in the past. Of course, what really matters is the current community consensus.
- Personally, I don't see any issue as long as there aren't too many false positives and the deletions are monitored. The Anonymouse (talk) 06:53, 18 November 2013 (UTC)[reply]
- Yes, it deletes automatically, but the deletions show up in RC and anyone can check them. Amir (talk) 15:51, 19 November 2013 (UTC)[reply]
@Ladsgroup: can you post the code, please? I myself have made several thousand deletions using merge.py, so I wouldn't be concerned with this task as long as you take responsibility for it. --Ricordisamoa 23:19, 22 November 2013 (UTC)[reply]
- I can send you the code if you want (sorry for delay, I forgot) Amir (talk) 17:37, 23 November 2013 (UTC)[reply]
- @Ladsgroup: I'd prefer the bot code being public to the community, given the potential dangerousness of this task. --Ricordisamoa 03:20, 7 December 2013 (UTC)[reply]
- @Ricordisamoa: you're right, I'll post it somewhere Amir (talk) 08:14, 29 December 2013 (UTC)[reply]
- Comment the merge should not go into the lower id but into the bigger or more used item (per Help:Merge). --Bene* talk 13:39, 11 January 2014 (UTC)[reply]
- if you think this is a better way I can do it (there is no difference for me) but I prefer to bring this up in WD:PC Amir (talk) 14:33, 14 January 2014 (UTC)[reply]
- I fixed it Amir (talk) 18:09, 14 January 2014 (UTC)[reply]
- @Bene* Why should the merge go in the higher ID? I strongly vote to merge in the most used or in the lower one. Help:Merge also says in section Select recipient item: "choose the one with the lowest Q####, as it is the oldest item" — Felix Reimann (talk) 09:44, 21 February 2014 (UTC)[reply]
- Hi, where did I recommend to merge into the higher id? I only said that the most used item or the one having more sitelinks or statements should be chosen. I assume you agree with that. -- Bene* talk 15:52, 21 February 2014 (UTC)[reply]
- I was not sure what you meant with bigger item. If you meant bigger in terms of numbers of interwiki or backlinks, then I'm very fine with it. — Felix Reimann (talk) 16:10, 21 February 2014 (UTC)[reply]
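The recipient rule the thread converges on (merge into the item with more sitelinks or statements, and fall back to the lowest Q-id as quoted from Help:Merge) could be sketched like this. Adding sitelinks and statements into a single size score is an assumption made for this example, and `ItemSummary` and `pick_recipient` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ItemSummary:
    """Hypothetical summary of an item; the fields are assumptions for this sketch."""
    qid: str
    sitelink_count: int
    statement_count: int


def pick_recipient(a: ItemSummary, b: ItemSummary) -> ItemSummary:
    """Prefer the bigger item; on a tie, prefer the lower (older) Q-id."""
    def rank(item: ItemSummary) -> Tuple[int, int]:
        # Assumption: size is simply sitelinks plus statements.
        size = item.sitelink_count + item.statement_count
        # Negate the size so that min() picks the largest item first,
        # then the smallest numeric Q-id.
        return (-size, int(item.qid[1:]))
    return min((a, b), key=rank)
```

With this ordering, a larger item always wins regardless of its Q-id, and the Q-id only decides between items of equal size.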
This is the code; you can check it out: User:Ladsgroup/del.py Amir (talk) 18:09, 14 January 2014 (UTC)[reply]
- @Ricordisamoa: Amir (talk) 08:55, 15 January 2014 (UTC)[reply]
- @Ladsgroup: you should construct the Site object by using the DBname directly, and not by removing "wiki" from it to get the language code, since it may break with other sites than Wikipedia. And BTW the getReferences method of the 'pywikibot/core' branch comes with a built-in namespace filter. --Ricordisamoa 18:39, 16 January 2014 (UTC)[reply]
- about the first one, I'll fix it, the latter, I'm running via compat (I'm a little bit old-school) Amir (talk) 18:41, 16 January 2014 (UTC)[reply]
- @Ricordisamoa: fixed Amir (talk) 09:04, 17 January 2014 (UTC)[reply]
- Does the bot check if any items link to the item being considered for deletion? --Yair rand (talk) 06:53, 19 February 2014 (UTC)[reply]
- @Yair rand: yes. It checks that the item doesn't have any back links Amir (talk) 07:26, 19 February 2014 (UTC)[reply]
- Support for items without statements and backlinks. --Pasleim (talk) 16:48, 20 February 2014 (UTC)[reply]
- Support I support this bot and hope that this won't be a one-run bot (request). Matěj Suchánek (talk) 21:57, 21 February 2014 (UTC)[reply]
- Surely it won't. Amir (talk) 06:20, 22 February 2014 (UTC)[reply]
- Can you make a (semi-random) list of these empty items to give me an idea what we're talking about? And to others: Please don't start deleting or editing them, that defeats the point. Multichill (talk) 10:06, 22 February 2014 (UTC)[reply]
- This is a very important comment. Could you generate the full list too? I think the community must review it very carefully before approving this task. -- Ivan A. Krestinin (talk) 18:50, 22 February 2014 (UTC)[reply]
- I'm working on making a sample list. Amir (talk) 09:23, 23 February 2014 (UTC)[reply]
@Multichill: @Ivan A. Krestinin: this is the list, and it's being filled in automatically: User:Ladsgroup/A report Amir (talk) 14:41, 23 February 2014 (UTC)[reply]
- Thanks. Could you fix line breaks? — Ivan A. Krestinin (talk) 16:58, 23 February 2014 (UTC)[reply]
- Done :) Amir (talk) 06:16, 24 February 2014 (UTC)[reply]
- Question Super Turrican 2 (Q3977633) and Super Turrican 2 (Q7642890) have different fr-language descriptions. Which description will the merged item have? -- Ivan A. Krestinin (talk) 08:25, 24 February 2014 (UTC)[reply]
- @Ivan A. Krestinin: the description that the bigger item has Amir (talk) 17:02, 25 February 2014 (UTC)[reply]
- That is not a good idea. The bigger item can have a bad description for some languages. General idea: the bot must not lose any good information. If the bot cannot save all the information, or cannot decide which information is better, it must skip the conflicting item pair. Such pairs need manual processing. -- Ivan A. Krestinin (talk) 18:44, 25 February 2014 (UTC)[reply]
- @Ivan A. Krestinin: you're right, I'm going to change it to check for description conflicts and, if it can't decide, skip the pair. Thank you for sharing this concern. Best Amir (talk) 07:22, 1 March 2014 (UTC)[reply]
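The skip-on-conflict rule agreed on here might look something like the sketch below. It is only an illustration of the idea, not the bot's actual check: descriptions are modelled as plain language-to-text dictionaries, `merge_descriptions` is a hypothetical helper, and treating descriptions that differ only in surrounding whitespace as equal is an assumption.

```python
from typing import Dict, Optional


def merge_descriptions(
    target: Dict[str, str], source: Dict[str, str]
) -> Optional[Dict[str, str]]:
    """Combine two description maps, or return None if any language conflicts.

    A None result means the pair should be left for manual processing.
    """
    merged = dict(target)
    for lang, text in source.items():
        if not text:
            continue  # nothing to carry over for this language
        existing = merged.get(lang)
        if not existing:
            merged[lang] = text  # no loss: the target had nothing here
        elif existing.strip() != text.strip():
            return None  # both items say something different: do not guess
    return merged
```

Returning None rather than picking a side reflects the principle stated above that the bot must not lose good information when it cannot decide which description is better.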
- So we have this backlog. With this we'll clear that, but what happens after that? Maybe we can create daily reports for admins to check? Could be based on the date of the last revision. Multichill (talk) 20:20, 24 February 2014 (UTC)[reply]
- If you want, my bot can log its deletions somewhere, along with the reason for each deletion. It's not a big deal. Amir (talk) 17:02, 25 February 2014 (UTC)[reply]
@Ladsgroup: please take a look at Wikidata talk:Bots#Merging items. Regards, --Ricordisamoa 13:28, 8 March 2014 (UTC)[reply]
@Ivan A. Krestinin: It's fixed now. @Ricordisamoa: this is great, but I can't use it for several reasons: 1- I use compat, not core (I'd love to port it but I don't have time right now). 2- The most important issue here is checking that the item is empty enough to delete, and because the Q-number doesn't matter anymore, my code won't do merging now (I wrote code to merge, but I can't use it because the Q-number no longer matters, and it's not okay to copy the content of a big item into another item just because the latter has a lower Q-id). Amir (talk) 13:45, 8 March 2014 (UTC)[reply]
- @Bene*, Vogone, Legoktm, Ymblanter, The Anonymouse: Any 'crat to comment?--GZWDer (talk) 04:56, 30 April 2014 (UTC)[reply]
Is this ready for approval? The Anonymouse [talk] 17:06, 7 May 2014 (UTC)[reply]
- @Ladsgroup:, but @The Anonymouse: It seems that an admin flag is needed.--GZWDer (talk) 05:11, 8 May 2014 (UTC)[reply]
- Do you think I need to make a WP:RfA for it? Amir (talk) 11:28, 8 May 2014 (UTC)[reply]
- According to bot policy, yes. Also, you might want to run your bot on your main admin account and make a few test deletions, if you think it needs testing. The Anonymouse [talk] 15:42, 8 May 2014 (UTC)[reply]
- okay, I did several tests Amir (talk) 21:47, 11 May 2014 (UTC)[reply]
- See Wikidata:Requests for permissions/Administrator/Dexbot. The Anonymouse [talk] 16:12, 12 May 2014 (UTC)[reply]
Thanks for posting code Amir. I think the skip condition "if linkpage.namespace()==0:" should be more inclusive, or possibly even removed (initially?). I have had items deleted when they were being discussed on a task force discussion page, and for a non-admin this is very annoying as all that is left is Qddddd, with no way to remember what it was. (I had one restored, but there is still one red link at Wikidata talk:Sport results task force). Also consider that there might be an active discussion about the item on WD:RfD or WD:PC. Perhaps run the bot skipping any item with an incoming link until the backlog is cleared, then let humans review the backlog of items with incoming links from other namespaces, and then try to introduce some logic which only deletes the ones that are not valuable. John Vandenberg (talk) 17:16, 13 May 2014 (UTC)[reply]
- I think excluding user and user talk namespaces is good enough, what do you think? Amir (talk) 11:39, 14 May 2014 (UTC)[reply]
- @Ladsgroup:, just checking ... do you mean something like the following?
    for linkpage in data.getReferences():
        if linkpage.namespace() not in [2,3]:
            Do=False
- If so, that sounds like a good way to run the bot to clear the majority of the junk out. John Vandenberg (talk) 11:51, 14 May 2014 (UTC)[reply]
- exactly. I'll change it Amir (talk) 11:54, 14 May 2014 (UTC)[reply]
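The namespace rule agreed on above can be tried out without a live wiki using a sketch like the following. It assumes the namespace numbers of the linking pages have already been collected (in the bot they would come from the pages returned by getReferences); `has_blocking_backlink` and `USER_NAMESPACES` are hypothetical names. In MediaWiki, namespace 2 is User and 3 is User talk.

```python
from typing import FrozenSet, Iterable

USER_NAMESPACES: FrozenSet[int] = frozenset({2, 3})  # 2 = User, 3 = User talk


def has_blocking_backlink(
    backlink_namespaces: Iterable[int],
    ignored: FrozenSet[int] = USER_NAMESPACES,
) -> bool:
    """Return True if any incoming link comes from outside the ignored namespaces."""
    return any(ns not in ignored for ns in backlink_namespaces)


# An item linked only from user pages can still be processed;
# one linked from a project or talk page is left for human review.
print(has_blocking_backlink([2, 3]))  # False
print(has_blocking_backlink([2, 5]))  # True
```

Keeping the ignored set as a parameter makes it easy to tighten the rule later, for example by passing an empty set to skip every item that has any incoming link.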