Toes in the Agile pond: week 1
I am currently attending the daily standup as the mySociety team develop a tool I am referring to as the ‘Scraper to Wikidata via validity check’ tool (I think that name needs changing!).
The team have been working on this for a week now, although some of the team have been at a developers' conference. Below is an outline of the bits I ought to understand (I don’t need to learn coding…yet!) as I currently understand them (which means there are likely a few misunderstandings!). The following summary is informed by asking the team to explain it like I’m a 5 year old, and then reading the GitHub issues.
Goal for the day: to get a page out that could be used, even if very basic. This was so that user feedback could be given before any more work, built on assumptions about the user’s needs, went ahead. I like seeing rapid prototyping in different contexts (previously it was street design, and it was far less rapid!).
The India morph.io was the starting point. This is a store (?) of data taken from English Wikipedia by EveryPolitician scrapers, about the politicians currently elected to the lower house of India’s Parliament, aka the 16th Lok Sabha. This is a good place to start because:
- All people, parties, and constituencies already exist in Wikidata and are pre-reconciled (this, I think, means the Wikidata ID associated with each person, party or place is already linked to the relevant Wikipedia page).
- Very few politicians already have a relevant P39 (position held) property on Wikidata. Where they do, the qualifier is either the term they are serving or, for some, a start time. Most politicians have either a bare P39 (no qualifiers) or no P39 at all.
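To make the three cases concrete (no P39, a bare P39, a P39 with qualifiers), here is a minimal sketch. The data shape, the function name `classify_p39` and the `Q_...` IDs are all hypothetical placeholders for illustration, not the tool's actual code; P580 (start time) and P2937 (parliamentary term) are the Wikidata qualifier properties this sketch assumes are involved.

```python
# Hypothetical, simplified representation: each P39 statement is a dict with
# the position's item ID and a dict of qualifiers keyed by property ID.
# P580 is Wikidata's "start time"; P2937 is "parliamentary term".

def classify_p39(statements, position_id):
    """Return 'none', 'bare' or 'qualified' for one person's P39 data."""
    relevant = [s for s in statements if s["position"] == position_id]
    if not relevant:
        return "none"
    if any(s.get("qualifiers") for s in relevant):
        return "qualified"
    return "bare"


# Placeholder IDs, not real Wikidata items.
POSITION = "Q_MEMBER_OF_LOK_SABHA"
TERM = "Q_16TH_LOK_SABHA"

print(classify_p39([], POSITION))  # none
print(classify_p39([{"position": POSITION, "qualifiers": {}}], POSITION))  # bare
print(classify_p39([{"position": POSITION, "qualifiers": {"P2937": TERM}}], POSITION))  # qualified
```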
Note, the team used a mob programming approach, meaning they all worked on the project together (albeit remotely).
Day 2: 27/06/2018
Goal №2: The user (Tony) needs to be able to click buttons that add data into Wikidata, ideally including the political party. What the team will be doing:
- Fixing: the tool is currently skipping the step whereby the user can verify that the person has held a position and add it. It seems to be skipping this because the Wikipedia item is already reconciled with the person’s Wikidata ID, but this step is key to adding P39 info.
- Allowing the tool and user to add into Wikidata the party information for the person (where it exists on Wikipedia).
- Fixing issue #153: where a person has held a previous position and is currently in a new position (P39) which hasn’t yet ended, the tool is incorrectly flagging that they currently have two P39 statements.
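The behaviour #153 is asking for can be sketched roughly as below. This is only an illustration under assumptions: it treats a P39 statement as "current" when it has no end time qualifier (P582), and the data shape, function names, item IDs and dates are all made up for the example rather than taken from the tool.

```python
# Assumption: a position counts as "current" if its statement has no
# end time qualifier (P582). P580 is the start time qualifier.

def current_positions(statements):
    """Return the statements that have not ended (no P582 qualifier)."""
    return [s for s in statements if "P582" not in s.get("qualifiers", {})]


def has_multiple_current(statements):
    """True only if more than one position is still held."""
    return len(current_positions(statements)) > 1


# Illustrative data: one ended position and one ongoing position.
previous = {"position": "Q_OLD_ROLE",
            "qualifiers": {"P580": "2009-06-01", "P582": "2014-05-18"}}
current = {"position": "Q_NEW_ROLE",
           "qualifiers": {"P580": "2014-05-26"}}

print(has_multiple_current([previous, current]))  # False
```

With a check like this, a previous position that has an end time would not be counted alongside the ongoing one.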
Day 3: 28/06/2018
Yesterday there was a lot of merging of code to deliver and the team did not manage to get a new version deployed.
Goal №3: To get as much as possible deployed from yesterday’s merging and reviewing, in order to see the value of the work.
- Deployment has to happen before the next steps are defined, but they look likely to be:
- Fix #154: The tool is extending a start date to before the official start of the legislative term, adding data of inconsistent quality and not recognising that the person held a different position prior to that role.
- Fix #153: Reporting that a person holds two P39s, when in Wikidata they only hold one.
- Fix #169: This issue does not currently need fixing (because we are working with data where the Wikidata item for the person already exists), but it is a capability the tool ought to have in time (and one that has been used, or is planned to be used, for getting crowdsourced data).
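For #154, one way to picture a guard against start dates that stray before the term is the small check below. It is a sketch under assumptions, not the team's fix: the function name is hypothetical and `TERM_START` is an illustrative date, not the real start of the 16th Lok Sabha.

```python
from datetime import date


def start_date_problem(proposed_start, term_start):
    """Return a message if the proposed start predates the term, else None."""
    if proposed_start < term_start:
        return "start {} is before term start {}".format(
            proposed_start.isoformat(), term_start.isoformat())
    return None


# Illustrative date only, not the actual start of any real term.
TERM_START = date(2014, 6, 1)

print(start_date_problem(date(2012, 3, 15), TERM_START))
print(start_date_problem(date(2014, 6, 1), TERM_START))
```

A tool could refuse to write the statement, or ask the user to check it, whenever a message comes back.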
Day 4: 29/06/2018
Current status: An almighty pull request happened (which meant a lot of reviewing needed to be done by one developer) and this slowed things a little. mySociety apparently have a code which suggests that pull requests should cover one item at a time for merging and reviewing…
- Yesterday we deployed #168. We now have a version that is making updates in Wikidata.
- #170 (which is about the tool supporting the user to verify the party name) has been reviewed, but is not yet ready to deploy. Some interesting limitations remain in the tool: for example, you cannot yet create an item for the party if one doesn’t yet exist.
- Ensure the tool can link/reconcile a Wikipedia item with a Wikidata ID if they are not yet linked (#183)
- Adjust the tool so it can also add people who, though elected, do not have a constituency (#182), such as people elected to represent a minority within the country. Example: Iusein Ibram, who represents the Turkish minority within Romania
- Possible quick win: #175. Every statement in Wikidata is ideally verified by a referenced source. The tool needs to be able to add references to the statements it creates. This is a useful one to get done as early as possible, so that the data we are inserting is of good quality.
- Finish and deploy #170 (see above)
- Then look at not needing to re-reconcile the same party more than once per source page. In other words, if you verify a party once, the rest then happen without the user needing to verify the existence of the party again.
- Look into items needing manual intervention that probably shouldn’t (#153, #154)
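The "verify a party once per source page" idea above is essentially remembering earlier answers. A minimal sketch, assuming a hypothetical `PartyReconciler` class and an `ask_user` callback that stands in for the real verification step (neither is from the tool itself):

```python
class PartyReconciler:
    """Remembers which Wikidata item the user chose for each party name.

    One instance is assumed to be created per source page, so the user is
    only asked once per party on that page.
    """

    def __init__(self, ask_user):
        self._ask_user = ask_user  # hypothetical callback: name -> item ID
        self._verified = {}

    def reconcile(self, party_name):
        # Only ask the user the first time this party name is seen.
        if party_name not in self._verified:
            self._verified[party_name] = self._ask_user(party_name)
        return self._verified[party_name]


# Usage with a stand-in for the user's answer and a placeholder item ID.
asked = []

def fake_ask_user(name):
    asked.append(name)
    return "Q_PLACEHOLDER_PARTY"

reconciler = PartyReconciler(fake_ask_user)
for _ in range(3):
    reconciler.reconcile("Example Party")

print(len(asked))  # 1
```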
Day 5: Monday 2nd July
Current status: There was a bug (issue #186) that has been blocking the deployment of any new versions. It was extending the start time (see issue #154 above), which meant we couldn’t use the tool: we might end up putting bad-quality data into Wikidata, which would not do anyone any favours and might annoy Wikidata community members. #170 has been deployed but is not having any effect because of #186… and one team member’s computer is struggling to deploy some items because of memory limits and the fans going mad in the summer heatwave. The user is keen that we stick to getting a version out each day, so the team are honing goals to ensure they factor in deployment by close of day (so goals might need to be smaller).
- Fix the bug! The devs are going to pair on #186
- #153 (see above) needs reviewing before being deployed
- Other issues such as #154 might be related to the bug… I actually already thought these were the same issue! (note to self: detail, Georgie!)
- #175 still a potential for a quick win
- Get a version out: needn’t be perfect
Summary to self:
- To understand the product, don’t miss the first planning standup (in fairness I was at the doctor’s) and have a go on the product to know what it should be / could be / is currently able to do… before reading issues!
- Forget the other similar products you have used, because they were for a different purpose
- Ask for clarification when people say “oh the thing is now doing that thing we hoped it would do”… otherwise I will get very behind