Wikisource:Scriptorium/Archives/2019-11

Please do not post any new comments on this page.
This is a discussion archive first created in November 2019, although the comments contained were likely posted before and after this date.
See current discussion or the archives index.

Transcription completeness and traditional vs. modern finding aids (i.e., tables of contents, indices, search engines)

I am interested in learning what Wikisource editors think about the relative value of tables of contents and indices, considering that search engines often meet similar needs. In some cases, it seems to me it may be a poor use of an editor's time to create a sophisticated transcription of an existing TOC or index. I've gone into some detail on this here: Talk:Oregon Historical Quarterly

Of course, a reader's needs will vary from one kind of work to the next. I'm not looking for an absolute rule (especially considering that choices about what work to prioritize on Wikisource are made by individual volunteers), but I'm curious about what principles other Wikisource editors apply when making these kinds of decisions. Pinging several editors I've discussed similar issues with: @Kaldari, @Beleg Tâl, @Billinghurst, @EncycloPetey: -Pete (talk) 03:34, 1 November 2019 (UTC)

@Peteforsyth: Well from a practical viewpoint the conversion utilities for preparing EPUB etc. publications pretty much assume the entire piece is linked (at least indirectly) from the opening page. In a (non-trivial) traditional publication that normally incorporates the contents page(s) and the utilities rely upon this. Trying to transclude a work without such linkage may look great from within wikisource but more or less guarantees outsiders will see something most frustratingly (and to them incomprehensibly) incomplete… 114.78.66.82 04:03, 1 November 2019 (UTC)

I have basically the same viewpoint. I transcribe and transclude the tables of contents (with links) so that the exporting tools will work correctly. I usually don't bother to add links within indexes, however. Kaldari (talk) 04:10, 1 November 2019 (UTC)

Agree with above regarding the Contents. And for works with many chapters or sections, the contents allow quick navigation to a specific part from the main page of the work. They can also provide an overview to the reader when there are multiple parts with chapter numbering that restarts with each part. That is, it is not unusual to have more than one "Chapter 2" in a work, and relying on a search to find the specific Chapter 2 you are looking for is not ideal in such cases. Regarding Indices: these can index more than simple words. They also index topically, and for larger topics they inform the reader where specific subtopics are covered. --EncycloPetey (talk) 05:05, 1 November 2019 (UTC)

Comment ToCs tie a work together, so as long as you have a way to tie transcluded subpages, that is the important component. I value ToCs in a work, that set the beginning of a work nicely IMNSHO. I also think that they represent the author's final rendition of a work.
Indices have never been considered the priority, though they are nice and specific to a work if someone wants to browse a work's contents in a little more detail. We never fuss over their absence, though when done, we add them in. As most people don't add {{engine}} to non-fiction works, and as such searching within a work is often not readily available for most readers, the index does cover the absence. Phe used to have a good script to convert validated indices to page links, though that stopped a while ago. It is a nice touch if you get that far, though not one anyone would or should castigate for its absence. I will comment that indices are regularly mentioned in book reviews, for their absence or quality, so I find that of interest. — billinghurst sDrewth 05:43, 1 November 2019 (UTC)

I don't see where a search engine reduces the need for a Table of Contents; that's where the author tells you what's in the book, and directs you to the primary section where each major subject is covered. Indexes are a lot of work to properly transcribe and link, but a good index can offer you directions to stuff that may not come up easily if searched (e.g. canonical names or phrases that are hard to search for). It's not the first thing I'd get on, but it has value.--Prosfilaes (talk) 07:28, 1 November 2019 (UTC)

Thank you all, this is very helpful. It's great to hear the thorough views of experienced editors, and it's all persuasive. And it seems useful to think about index and TOC pages as slightly different entities.

It's good to know that there is not a strong sense that index pages must be included; I do agree with EncycloPetey and others that they have utility beyond what machine-generated search can provide, but the effort-to-reward ratio isn't always great enough to motivate me to transcribe them.
With the ToC, I also agree that they're more important, but I'm realizing that I've encountered an unusual situation with the Oregon Historical Quarterly. I'm still not sure what the best way to proceed is in this case, but it's not worth getting into in a general discussion. If anybody has the patience to take a closer look at how I've set up those pages and discuss it, I'd appreciate your comments at Talk:Oregon Historical Quarterly. -Pete (talk) 16:50, 1 November 2019 (UTC)

NOINDEX meta tag

Is there a reason that we don't place a NOINDEX meta tag on index: and page: pages? I think it would be a better outcome for the potential reader if a Google search on a book title returned mainspace results only. Moondyne (talk) 09:20, 2 November 2019 (UTC)

I don't know if there is a reason. It seems like a reasonable suggestion. —Beleg Tâl (talk) 17:04, 4 November 2019 (UTC)

Wikilivres

Wikilivres is down, and has apparently been so for some time. Does anyone know the prognosis? We have a lot of links to it, from this project, and sister projects. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:11, 4 November 2019 (UTC)

No idea. I would suggest that if we believe that it is not coming back that we can look to creating a new landing page at Meta: (or an agreed alternate site) that explains the site and that it is no longer active. I can then get the mapped interwiki link pointing to that page. We can then work out what we want to do with the links, and explore what is happening with the site. — billinghurst sDrewth 21:19, 4 November 2019 (UTC)

Tech News: 2019-45

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

At Special:Contributions you could see up to 5000 edits at the same time if you edited the URL. This has been lowered to 500. This is to stop requests which break the sites. [1]

Changes later this week

MediaWiki:ipb-default-expiry can set the default length to block a user for your wiki. You will be able to use MediaWiki:ipb-default-expiry-ip to set a different default block length for IP editors. [2]
The new version of MediaWiki will be on test wikis and MediaWiki.org from 5 November. It will be on non-Wikipedia wikis and some Wikipedias from 6 November. It will be on all wikis from 7 November (calendar).

Meetings

You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 6 November at 15:00 (UTC). See how to join.

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

16:47, 4 November 2019 (UTC)

Community Wishlist 2020

The 2020 Community Wishlist Survey is now open! This survey is the process where communities decide what the Community Tech team should work on over the next year. We encourage everyone to submit proposals until the deadline on November 11, 2019, or comment on other proposals to help make them better.

This year, we’re exclusively focusing on smaller projects (i.e., Wikibooks, Wiktionary, Wikiquote, Wikisource, Wikiversity, Wikispecies, Wikivoyage, and Wikinews). We want to help these projects and provide meaningful improvements to diverse communities. If you’re a member of any of these projects, please participate in the survey! To submit proposals, see the guidelines on the survey page. You can write proposals in any language, and we will translate them for you. Thank you, and we look forward to seeing your proposals!

IFried (WMF) 19:30, 4 November 2019 (UTC)

Is posting to a website considered publication here just as it is in US copyright law?

Is posting to a website considered publication at Wikisource just as it is in US copyright law? Is the distribution of visibly perceptible copies considered publication at Wikisource as it is in US copyright law? See User:Richard_Arthur_Norton_(1958-_)/Vunk_-_Quick_Burial_Ground moved to user space because not considered "published". The data is not eligible for copyright because it consists entirely of information that is common property and contains no original authorship. Let me know what you think. --Richard Arthur Norton (1958- ) (talk) 15:42, 4 November 2019 (UTC)

@Richard Arthur Norton (1958- ): Yes and no. Our governing policy on the subject is Wikisource:What Wikisource includes. Analytic works like Reimer's cemetery listing "must have been published in a medium that includes peer review or editorial controls; this excludes self-publication". Most of the time, posting to a website does not fulfil this requirement, as it is usually the author self-publishing the content, or a platform hosting the content without editorial control. —Beleg Tâl (talk) 16:58, 4 November 2019 (UTC)

The South Brunswick Township Public Library published it on their website in April of 2019 based on the copy deposited by Reimer in 1977. The website version has the SBTPL annotations and their indexing at the top of the page, putting it under their editorial control. If she had posted it on her personal website or in a personal blog, it would be self-published. See: http://www.sbpl.info/township/cemeteries.htm#Vunk and https://www.sbpl.info/wordpress/wp-content/uploads/2019/04/South-Brunswick-Cemeteries-By-Burial-Ground.pdf --Richard Arthur Norton (1958- ) (talk) 17:15, 4 November 2019 (UTC)

I'd note that posting to a website is not necessarily considered publication under US copyright law. The Copyright Office says it's unclear, and it seems that it's frequently treated as broadcasting. To be clearly publication, the page would have to explicitly offer downloads or otherwise transfer copies to people.--Prosfilaes (talk) 01:04, 5 November 2019 (UTC)

Please note that it is published as a webpage and as a pdf. When I click on the pdf in my Chrome Browser it downloads automatically and opens in my Adobe pdf viewer, so yes it is physically distributed. --Richard Arthur Norton (1958- ) (talk) 17:52, 6 November 2019 (UTC)

If the same site or the blog published a poem from a four year old, would you consider it published? I believe that we are talking about a reasonable process of peer review, and the website you discuss and their processes are not clearly that peer review. — billinghurst sDrewth 10:32, 5 November 2019 (UTC)

This isn't a poem, it's a list of names and dates. Regardless of whether the above PDF can be considered "published", I would say it is out of scope. --Xover (talk) 11:14, 5 November 2019 (UTC)

"unless it is published as part of a complete source text"—lists of names and dates can be in scope. —Beleg Tâl (talk) 12:01, 5 November 2019 (UTC)

Your example is just such a larger work that provides context to the list, even if the list is the main meat (unlike, e.g., the list being a short appendix to a hundred+-page work). It's in the borderlands, I would say, but comfortably over the line for inclusion. But the list at issue here is just a list of names and dates because it is excerpted from that context in order to circumvent copyright restrictions, making it out of scope as an excerpt too. With the context it would be copyvio. --Xover (talk) 12:09, 5 November 2019 (UTC)

(ec)the list or not list is not the issue, it is the peer reviewed component. Websites can publish any set of snippets without a clear editorial decision, and review process for us to host the work. If the list is already published, then we can just as easily link to it from the author page to its present site. — billinghurst sDrewth 12:13, 5 November 2019 (UTC)

Agreed —Beleg Tâl (talk) 13:03, 5 November 2019 (UTC)

it could be data for either wikidata or commons dataset. the site says: "A paper copy is available for use at the South Brunswick Public Library." i would prefer a discussion about scope issues before moving, for some consensus. it would be better to have the scans from the "Somerset County Historical Quarterly," [3] but unclear that cut and paste works are disruptive. better to pivot to more productive editing. Slowking4 ⚔ Rama's revenge 16:35, 6 November 2019 (UTC)

Copyright in Ethiopia and Template:PD-Ethiopia

Please note that our {{PD-Ethiopia}} is out of date: Ethiopia enacted a copyright law in 2004 but our template refers to a 1960 version. The old law provided protection only during the author's lifetime, but the new law is pma. 50 with some PD-EthiopianGov type exemptions. Crucially, however, our template does not distinguish between copyright status in Ethiopia and copyright status of Ethiopian works in the US.

Since Ethiopia still does not have copyright relations with the US, no Ethiopian works are currently protected by copyright in the US, and can be freely hosted here.

However, if transferring a file to Commons the distinction becomes relevant. In those circumstances, do not depend on our {{PD-Ethiopia}} tag! Each file with this tag will need to be assessed individually.

Ideally we would modify our Ethiopia-related licensing templates and then review and correctly tag all works in Category:PD-Ethiopia with both Ethiopian and US copyright status (some works may be eligible to move to Commons even under their stricter policy). --Xover (talk) 19:11, 6 November 2019 (UTC)

Fixed —Beleg Tâl (talk) 19:08, 8 November 2019 (UTC)

List of index pages

Tracked in Phabricator: Task T232710

How is the List of Index Pages supposed to work? It seems that it always gives the same results no matter what is filled in the Search field. --Jan Kameníček (talk) 15:45, 8 November 2019 (UTC)

The results page says "The search engine does not work. Sorry for the inconvenience." So I assume it's supposed to work normally but is broken. —Beleg Tâl (talk) 18:48, 8 November 2019 (UTC)

Oh, thanks, my fault… --Jan Kameníček (talk) 00:36, 9 November 2019 (UTC)

Although my experience with Phabricator is much worse than bad, I have given it a try and reported it, see task T237831. --Jan Kameníček (talk) 20:54, 9 November 2019 (UTC)

In fact it had already been reported two months earlier: task T232710 --Jan Kameníček (talk) 17:51, 10 November 2019 (UTC)

Tech News: 2019-46

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

MediaWiki2LaTeX can put different pages from a Wikimedia wiki into a PDF. It can now make a PDF with around 5000 pages. Previously this was 800 pages.

Changes later this week

There is no new MediaWiki version this week.

Meetings

You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 13 November at 16:00 (UTC). See how to join.

Future changes

Wikimedia will take part in Google Code-in. This is for young students who want to help with open source software. You can read more. Experienced technical Wikimedians can mentor students.


22:02, 11 November 2019 (UTC)

New template for hyphenated words across pages

Having just run across a work that tripped multiple edge cases in how ProofreadPage joins together pages, I finally put together a utility template to make dealing with these easier.

Peh! (aka. {{page end hyphen}}, aka. please suggest a better name! 😁)

The details are in the template documentation, but the short version is that if you have a hyphenated word that has been split across pages (i.e. where the word should still be hyphenated when transcluded into mainspace), or where the page ends with something (like an em-dash) that should be joined with the following page without inserting a space character, you can throw a {{peh}} at the end and get the desired effect in both Page: and mainspace (or in Translation:, or anywhere else). It defaults to a hyphen (“-”) when no arguments are provided, or uses its first argument otherwise (e.g. {{peh|—}}).

It's had limited testing, but it's simple enough that I don't think there's much risk of weirdness.

Oh, also, you can (I think) achieve the exact same results using {{hws}}/{{hwe}}, so if you're already using those for this then there's no particular reason to switch. This is just intended as a simpler and easier to use way to achieve the same result for those of us (surely I'm not the only one? Right? Right…?) who find {{hws}}/{{hwe}} complicated and confusing to use for these scenarios. --Xover (talk) 17:58, 13 November 2019 (UTC)
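The joining behaviour described above can be illustrated with a toy model. This is only a sketch of the described effect, not the template's or ProofreadPage's actual implementation; the names `peh`, `joinPages`, and `endJoiner` are hypothetical, and the only assumption about page joining is the one stated above (pages are normally joined with a space).

```javascript
// Toy model only: NOT how {{peh}} or ProofreadPage is implemented.
// Each page is { text, endJoiner }; `endJoiner` (hypothetical field)
// stands in for a {{peh}} placed at the very end of the page.

// Mirrors the documented default: a hyphen when no argument is given,
// otherwise the first argument.
function peh(joiner) {
  return joiner === undefined ? '-' : joiner;
}

// Concatenate pages the way the comment describes the end result:
// a space between pages, except where a page ends with a joiner,
// in which case the joiner is kept and no space is inserted.
function joinPages(pages) {
  let out = '';
  pages.forEach((page, i) => {
    out += page.text;
    if (page.endJoiner !== undefined) {
      out += page.endJoiner;
    } else if (i < pages.length - 1) {
      out += ' ';
    }
  });
  return out;
}

// A hyphenated word split across two pages stays hyphenated.
console.log(joinPages([
  { text: 'a well', endJoiner: peh() },
  { text: 'known author' },
])); // prints "a well-known author"
```

Passing an argument, for example `peh('\u2014')`, models the case where the page ends with an em dash that should run straight into the next page.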

Hm, that is a clever workaround… Much easier than hws/hwe. --Jan Kameníček (talk) 18:07, 13 November 2019 (UTC)

Spelling errors

I've forgotten the guidance on spelling errors in the original. "Seventeeth" in https://en.wikisource.org/wiki/Page%3APlomer_Dictionary_of_the_Booksellers_and_Printers_1907.djvu/177 ... Rich Farmbrough, 19:10 12 November 2019 (GMT)

You can use the {{SIC}} template. --Jan Kameníček (talk) 20:02, 12 November 2019 (UTC)

@Rich Farmbrough: We reproduce as they are. If you do use the template as suggested above, it is up to you whether you include text in the second parameter. Some consider it an annotation and an assumption so do not like it, some do like it. Personally, I use it though generally leave it empty unless it is really helpful to explain the alternate word. If you want to silently leave something inline, then we also have {{sic}}. As a note, if there is a whole swag of old text being reproduced we would not tag it, we let it stand. As per WP in wikilink first error, we would only tag the first error of each type. I saw that Martin has highlighted, in a WD talk, the work that he and I did. — billinghurst sDrewth 06:30, 13 November 2019 (UTC)

If you tweet, especially about Wikisource

Hi. For those who are on Twitter and tweet about Wikisource, a new reminder that some of us maintain @wikisource_en so please do include that account in your tweets as appropriate. Either in twitter, or here, please let us know your account so that we can follow. — billinghurst sDrewth 06:25, 13 November 2019 (UTC)

I'm @pigsonthewing, and will follow the above account as soon as Twitter lets me. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:35, 13 November 2019 (UTC)

AuFCL / MODCHK / random IP editor 114...

Dear AuFCL / MODCHK / random IP editor 114... Hoping that the NSW fires are not near to you, thinking that they are to your north-west and south-west. Best of luck with what is coming through your area. Wildfire sucks. — billinghurst sDrewth 12:16, 13 November 2019 (UTC)

Amen! --Xover (talk) 16:00, 13 November 2019 (UTC)

Greek: Aerodynamics

Could somebody who is able to read and write (rather: type) Greek please enter the words in that language on Page:Aerial Flight - Volume 1 - Aerodynamics - Frederick Lanchester - 1906.djvu/415 and the following page? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:46, 13 November 2019 (UTC)

@Pigsonthewing: If you use {{Greek missing}}, the page will be added automatically to Category:Pages with missing Greek characters which is monitored by users who can type Greek. —Beleg Tâl (talk) 15:10, 13 November 2019 (UTC)

@Beleg Tâl: Something new to learn every day. Done, thank you. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 15:31, 13 November 2019 (UTC)

Add Wikidata link to Index page

I made a thing: User:Samwilson/LinkIndexToWikidata.js. It adds a 'Wikidata item' row to the metadata table on Index pages, linking to the Wikidata item that refers to the Index page via Wikisource index page URL (P1957). If there's no link, it complains to you to fix it. :) To use, add this to your common.js page:
mw.loader.load('//en.wikisource.org/w/index.php?title=User:Samwilson/LinkIndexToWikidata.js&action=raw&ctype=text/javascript'); —Sam Wilson 23:24, 10 November 2019 (UTC)

@Samwilson: (Stupid-hat question) Why don't we just add the field to underlying template? Then we can gadgetify the script to make it more available. Or do we just gadgetify it anyway? — billinghurst sDrewth 06:36, 13 November 2019 (UTC)

@Billinghurst: Good question! It's because there's no sitelink from an Index page to its Wikidata item; the only link is via the URL stored in Wikisource index page URL (P1957), so the way a script can do it is by making a Wikidata Query Service request. A template (or Lua module) can't do that. Or do you mean, why don't we add a field for Wikidata ID to the template? That'd work, but it's duplicating the data (which is maybe not a bad thing; similar things are done elsewhere in the system). —Sam Wilson 12:25, 13 November 2019 (UTC)
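As a rough illustration of the reverse lookup described above, the sketch below builds a SPARQL query that finds the item whose Wikisource index page URL (P1957) matches a given Index page, plus the corresponding Wikidata Query Service request URL. It is not taken from User:Samwilson/LinkIndexToWikidata.js; the function names are hypothetical, and it assumes the URL is stored on Wikidata in exactly the form passed in.

```javascript
// Hypothetical helper names; not the code of LinkIndexToWikidata.js.
const WDQS_ENDPOINT = 'https://query.wikidata.org/sparql';

// Build a SPARQL query for the item that points at this Index page
// via P1957. URL-valued statements are matched as IRIs (<...>).
// Assumption: the stored URL is identical to `indexPageUrl`.
function buildIndexLookupQuery(indexPageUrl) {
  // Refuse characters that cannot appear inside a SPARQL IRI.
  if (/[\s<>"{}|\\^`]/.test(indexPageUrl)) {
    throw new Error('URL contains characters not allowed in an IRI');
  }
  return `SELECT ?item WHERE { ?item wdt:P1957 <${indexPageUrl}> . } LIMIT 1`;
}

// Build the GET request URL; the script would fetch this and read the
// JSON result bindings.
function buildWdqsRequestUrl(indexPageUrl) {
  const query = buildIndexLookupQuery(indexPageUrl);
  return `${WDQS_ENDPOINT}?query=${encodeURIComponent(query)}&format=json`;
}
```

Keeping the query construction separate from the network call makes the lookup easy to check without contacting the query service.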

I stopped bothering adding the index: backlink. It isn't in the WEF framework, and I just stopped bothering as it seemed to be of limited value. If it is being added to the {{book}} template at WD, then we can inhale it with the existing script, or we can enter it manually. Means that I created it a bit earlier. I sometimes wonder whether the duplication may allow for bots to better come along and tidy up. <shrug> — billinghurst sDrewth 12:49, 13 November 2019 (UTC)

Truth be known samwilson I would like to have more of the {{book}} data on the Index: page, hopefully passively added from Commons, or pulled from WD, rather than another manual addition. For instance I would like that where we have an IA work that we can have an active link to that work. I want to be able to more readily link to the jp2 zip file of the work so we can better work with image extraction and clean up, with our no longer actively supporting {{raw image}} extraction. Unfortunately I haven't found an online tool to open an online zip and extract single images, though I am still looking.
Now I don't know the best way to complete the three way dance with Commons and Wikidata, and it is always our issue that IA starts, Commons comes 2nd, then enWS Index: 3rd, enWS main ns, 4th, then usually WD comes 5th. If WD could occur at step two or step three (more automagically) and then Index: page that would be beautiful. Though that wish has never been fulfilled, and I have asked people like Lucas Werkmeister at a conceptual level … to silence, we are way down the food chain. RexxS is really helpful, though I don't like to push acquaintanceships too hard.
What this ignoramus thinks we need is <mode start=dream>

Update to MediaWiki:Proofreadpage index template for fields
though maybe it is a separate template can manually insert to start, or passively embed based on data links to WD (I dunno exactly, out of my paygrade)

Update to MediaWiki:Gadget-Fill Index.js which is the gadget that extracts data from Commons files and adds to respective Index: fields

— billinghurst sDrewth 01:46, 14 November 2019 (UTC)

@Billinghurst: I'd love to help, of course, but I'm a complete noob here and I don't understand the workflow or the terminology you're using. Checking random works and authors, I find Wikidata links, but no link for a random transcription. Is that where you're stuck at? When I looked at Index:Paradise Lost (1667).djvu, I could see that the linked title and linked author both have Wikidata items, but obviously not the transcription. I think for the moment, Sam is right - you need WDQS to do the reverse lookup. However, I suspect that it should be possible to have a field on the index page that records the Wikidata item containing that link once it's been found. Magnus Manske has a bot that can create lists on wiki-pages from the results of a WDQS query, so maybe a bot run could populate such a field for you? I'll have another think and see what I can work out. RexxS (talk) 18:56, 14 November 2019 (UTC)

Thanks RexxS. The work you found is just going to be complicated for a range of reasons, so let me try something cleaner.

I have prepped a completed and transcluded work hopefully as a better example.

WD edition item (well-populated; I make no promise that item cannot be further populated)
What would happen to the Irish Minority
Index:What would happen to the Irish Minority.djvu
c:file:What would happen to the Irish Minority.djvu
Internet Archive identifier: whatwouldhappent00dubl.

The three djvu-like pages are suitably populated with expected data. All are inter-related, and each contains different data. Noting that the WD item is for the edition, I haven't created one for the conceptual "book".

For a work in progress of transcription: Index:The best hundred Irish books.djvu <-> c:File:The best hundred Irish books.djvu <-> Internet Archive identifier: besthundredirish00obri, no wikidata item yet as I usually create those at the end, and no book item as I gave up creating those as too much extra effort.

If we need to get down and dirty then maybe we should pick a user talk space for the conversation, or a scratch space, or an IRC chat. <shrug> Guide me, I am really happy to step through things. Noting my [understanding of WikidataIB = knowledge of WDQS = capability in Module: ns]. (I suck at programming … conceptual hole). — billinghurst sDrewth 01:26, 15 November 2019 (UTC)

Ability to individually access single JP2 images from Internet Archive work archives

Now I may be completely slow on the uptake, however, today I have just identified that we can directly download individual JP2 files for the pages from a work. [If others had noticed this, then I apologise for missing your communications on this matter.]

Anyway this means that with something like GIMP, you can directly paste the url of the JP2 page into GIMP > Open location, load it straight into the application, and edit the best quality file.

@Xover: I think that this means we can probably steal best quality pages from another copy of the same work and rebuild files. Correct?

To see a file list from a file's /details/ page at Internet Archive follow the SHOW ALL link > beside the .zip click the "View Contents" link and VOILA a file list where you can grab a useable link. example link https://archive.org/download/whofearstospeako00cuma

@Samwilson: if you can suck in the IA link to an index file, we can simply template this based on

https://archive.org/download/<ia-identifier>/<ia-identifier>_jp2.zip/

Alternatively maybe we get this linked up from within the book template at Commons. — billinghurst sDrewth 03:45, 14 November 2019 (UTC)
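The URL pattern given above is simple enough to template. A minimal sketch, assuming only the two URL shapes quoted in this thread (the `/download/<ia-identifier>` file list and the `<ia-identifier>_jp2.zip/` listing); the function name and the character check on the identifier are illustrative assumptions, not an Internet Archive API:

```javascript
// Hypothetical helper: derive the Internet Archive links mentioned in
// this thread from an IA identifier such as "whatwouldhappent00dubl".
function iaDownloadUrls(identifier) {
  // Conservative sanity check (an assumption, not an IA rule):
  // allow letters, digits, dot, underscore and hyphen only.
  if (!/^[A-Za-z0-9._-]+$/.test(identifier)) {
    throw new Error('Unexpected characters in IA identifier');
  }
  const base = `https://archive.org/download/${identifier}`;
  return {
    fileList: base,                              // the "SHOW ALL" listing
    jp2Zip: `${base}/${identifier}_jp2.zip/`,    // contents of the JP2 zip
  };
}

console.log(iaDownloadUrls('whatwouldhappent00dubl').jp2Zip);
// prints "https://archive.org/download/whatwouldhappent00dubl/whatwouldhappent00dubl_jp2.zip/"
```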

Comment I am thinking that at least as an initial measure we could build an optional manual IA field into {{raw image}} that can turn on a component that displays text and link to the directory listing of the JP2 file. At some later point, when we have someone clever we may be able to build some linking of the Page:{{BASEPAGENAME}}/nn back to the Index, and any data known about the Index: page could be used to automatically populate the IA field. Just thoughts, happy to hear something cleverer. — billinghurst sDrewth 06:59, 14 November 2019 (UTC)

@Billinghurst: I'm not quite following your reasoning, but, yes, from a set of individual page images I can generate a DjVu with OCR text layer, regardless of where those page images came from. This is currently using hacky and semi-manual tooling that nobody but myself would ever use unless under duress, but I am investigating options for providing some kind of access to them for anyone to use in a way that is at least reasonably functional for normal people. In the mean time I am happy to generate DjVus for people if I am provided with a comprehensible specification of what page images in what order should make up the resulting DjVu. I can also do things like swap out a page in an existing DjVu, reorder pages in an existing DjVu, delete extraneous pages, etc., and am happy to do so, but, again, provided I get a clear specification of what needs to be done.

As for your larger thrust… I don't think I'm grasping the problem you are aiming to solve?

If we presume a correctly filled out Book template at Commons, the Index:-page preloader gadget can be extended to pull in the source link from there (in fact I think it already does, we just don't store it). The Index: template here can be extended to have a field to store the value from the source field at Commons. And it's possible to make a script that tries to pick out an IA link from that and generate a direct link to the "show all files" directory listing at IA. It is not possible to create a link directly to an individual page image at IA since their page images are arbitrarily named, and because we routinely make changes to works between IA and what's uploaded to Commons (think removing Google scan pages, calibration pages, duplicate pages, etc.). It is also technically possible, today, to make a script to go directly from a page in the Page: namespace to the directory listing at IA. Some or all of these will be somewhat hacky and prone to break, but that's already the case with the Index: preloader gadget and it seems to work enough to be worthwhile. *shrug*

In any case, lots of things are possible in this area, so it's mostly a matter of articulating which problem we are trying to solve. --Xover (talk) 08:08, 14 November 2019 (UTC)
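One of the possibilities mentioned above, picking an IA link out of a source field, might look roughly like the sketch below. It is an illustration only: the function name is hypothetical, it assumes the source value contains a plain archive.org `details`, `download`, or `stream` URL, and it makes no attempt to parse templates or other ways a source might be recorded at Commons.

```javascript
// Hypothetical helper: find an Internet Archive identifier in free text
// such as the value of a "source" field. Returns null if none is found.
function extractIaIdentifier(sourceText) {
  const match = /archive\.org\/(?:details|download|stream)\/([A-Za-z0-9._-]+)/
    .exec(sourceText);
  if (!match) {
    return null;
  }
  // Drop trailing dots picked up from sentence punctuation.
  const identifier = match[1].replace(/\.+$/, '');
  return identifier === '' ? null : identifier;
}

console.log(extractIaIdentifier(
  'Scan from https://archive.org/details/whatwouldhappent00dubl.'
)); // prints "whatwouldhappent00dubl"
```

As noted in the comment above, anything along these lines would be somewhat hacky and prone to break when the source field is filled in differently, so a null result has to be treated as normal.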

Problem 1: {{raw image}} was previously used by Hesperian as an indicator to populate converted jp2 images as png images, this upload locally stopped a while ago due to time and effort. And users had to download the PNG and clean, then we have to go through a migration and deletion process. All butt ugly.

Info Template:raw page scan (transclusions: 21,218, links: 5) / Template:Raw image (transclusions: 12,659, links: 21,262)

Problem 2: people have used the jpg images from (expanded) scans at IA as the basis of an extracted images to upload to commons, or as an ugly screenshot to upload. All butt ugly.

(solution to P1 and P2) Links to the folder enables users to at least try and to get best available quality.

Problem 3 There are broken scans here, and often people haven't fixed them as it was too hard to extract from a djvu, or get a source page to OCR separately.

(solution to P3) new source of single page to insert into djvu, or new source of single page image to OCR online and insert; was flagging nothing more

Problem 4 While scan in file has been good, the OCR has been rubbish

(solution to P4) as per S3, can OCR individual page for paste of text

Re general comment: increasing our general connectivity through Wikidata<->Commons was part of the discussion earlier on this page—samwilson's script discussion above—and to IA is more helpful, sure there will be old data, and occasionally broken data, though such a process as this is more likely to find and get fixed. I am advocating that we keep taking these steps.

Re book => index. We haven't looked at it as a community, and Jarekt has been better developing it at Commons, and we should review how we utilise the links, the code or the data, at the moment we scrape data, rather than leverage the available sources, and then only complete fields when we need to override.

Re linking, it looks as though direct linking is possible, eg. [4] though I was more advocating linking to the directory. — billinghurst sDrewth 09:00, 14 November 2019 (UTC)

@Billinghurst: Thanks, I'll try to see if I can come up with anything useful.

Regarding the direct linking, the problem isn't what IA provides, it's that we have no way to figure out which page image a Page: here corresponds to at IA. On the IA side, some scans count pages from zero, some from one; some include Google book pages, calibration pages, etc. that have been removed before upload to Commons, meaning our page 123 may be page 134 at IA. In other words, there's no way for a mere dumb computer to get from one to the other: you need a human being to connect the two. That said, there are things we can do to encourage the humans to add such links if we want them to: the {{raw image}} template can start by asking for an IA identifier if missing, and progress to link to the directory listing if one is provided, and also ask for an IA page identifier that will enable the direct link. --Xover (talk) 09:13, 14 November 2019 (UTC)
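The progression described above (ask for an identifier, then link to the directory, then ask for a page identifier) might be sketched roughly as follows. This is only an illustration in Python, not how {{raw image}} or module:rawImage actually works; the function name `raw_image_hint`, its parameters, and the message wording are hypothetical, and the `https://archive.org/download/<identifier>/` directory pattern is assumed here.

```python
def raw_image_hint(ia_identifier=None, ia_page_id=None):
    """Return the prompt or link text a raw-image notice could show.

    Hypothetical helper: mirrors the three stages described in the
    discussion, without guessing at a direct page-image URL.
    """
    if not ia_identifier:
        # Stage 1: nothing known yet, so ask the human for the identifier.
        return "Please add the Internet Archive identifier for this scan."
    # Assumed pattern for the directory listing of an IA item.
    directory = f"https://archive.org/download/{ia_identifier}/"
    if not ia_page_id:
        # Stage 2: link to the directory and ask for the page identifier.
        return f"Source files: {directory} (add an IA page identifier to enable a direct link)"
    # Stage 3: both values supplied by a human editor.
    return f"Source files: {directory} (page image: {ia_page_id})"
```

The point of the staging is that every value comes from an editor; nothing is computed from our page numbers, since those cannot be mapped reliably.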

I have started a conversation at template talk:raw image though the work is done in module:rawImage which eliminates me from the fix, though maybe not all the grunt work needs to take place in the module. I have also noticed that we give guidance at Help:Adding images and that is part of the above problem. — billinghurst sDrewth 10:24, 14 November 2019 (UTC)

@Xover: if DjVu files are ported from IA leaving the internal 'page id' unchanged, then deletions, etc. should not cause problems. Inspecting the local djvu file, we could get the correct IA djvu page. This is not true if new 'page ids' are used when regenerating djvus from IA. This at least could allow offline scripts to work. Would be nice to have this info through an API command (wishful thinking, I know ...). Mpaa (talk) 19:38, 14 November 2019 (UTC)

@Mpaa: Hmm. Interesting. I hadn't realised IA did that. The 'page ids' aren't actually identifiers as such, they're a "page name" and were, I believe, intended to be used essentially like our pagelist tag. I've been avoiding using them because they make it confusing when trying to manually manipulate a DjVu file (the DjVuLibre commands operate on physical page numbers, but DjView displays the "page name"; if the two differ you get seemingly random results). However I hadn't considered the possibility of using them to document the original page image from which the DjVu page was generated. I'll play around a bit when next I touch that code and see if there's anything clever we could do there. --Xover (talk) 19:49, 14 November 2019 (UTC)

For completeness, I mean this sort of info, e.g., <PARAM name="PAGE" value="whofearstospeako00cuma_0001.djvu"/> in here. I always try to leave that unchanged. Then we just need to play with the extension. I have seen bugs, e.g. the page offsets we sometimes get when uploading, related to changing these references. Mpaa (talk) 21:20, 14 November 2019 (UTC)
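As a rough illustration of "playing with the extension", an offline script could read the PAGE values and swap the suffix. This is a minimal Python sketch under stated assumptions: that page ids follow the `<identifier>_<digits>.djvu` shape seen in the example above, and that the original images use a `jp2` extension. The function names are hypothetical.

```python
import re

# Assumed shape of an IA-style DjVu page id: "<identifier>_<4+ digits>.djvu".
PAGE_ID = re.compile(r"^(?P<ident>.+)_(?P<seq>\d{4,})\.djvu$")


def page_ids_from_xml(xml_text):
    """Collect the PAGE parameter values found in DjVu XML text."""
    return re.findall(r'<PARAM\s+name="PAGE"\s+value="([^"]+)"\s*/>', xml_text)


def source_image_name(page_id, extension="jp2"):
    """Return the presumed original image name for a DjVu page id.

    Returns None when the page id does not match the assumed pattern,
    for example because the DjVu was regenerated with new page ids.
    """
    match = PAGE_ID.match(page_id)
    if match is None:
        return None
    return f"{match.group('ident')}_{match.group('seq')}.{extension}"
```

Returning None instead of guessing keeps the script honest for files where the page ids were changed.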

Work-specific disambig pages

A while ago, we agreed that it does not make sense for us to have author-specific disambiguation pages. For example, Sonnet (Shakespeare) should not exist as a disambiguation page, but instead all works by Shakespeare titled "Sonnet" should be listed directly at Sonnet and at Author:William Shakespeare.

I've noticed that we also have a number of work-specific disambiguation pages. For example, 1911 Encyclopædia Britannica/Abdera lists works entitled "Abdera" which are also part of the Encyclopedia Britannica. However, this page is redundant, as the works listed on that page are listed directly at Abdera and at 1911 Encyclopædia Britannica/Vol 1:1.

I would like to start merging these work-specific disambiguation pages into the main disambiguation pages, but I also want to get the community's input before I start. This also ties into my efforts to clean up the Wikidata items for Wikisource mainspace disambiguation pages. —Beleg Tâl (talk) 13:34, 15 November 2019 (UTC)

I would say

that the page "1911 Encyclopædia Britannica/Abdera" should redirect to the general disambiguation page for "Abdera" with merging of detail as required.

Philosophically we have agreed

one disambiguation page per term
where disambiguation pages contain main and other namespace items, then main namespace wins for siting
disambiguation pages can exist in any portal to disambiguate within a portal (above rules apply first)

While not desirable, I don't have a particular concern if we have work level disambig pages and nothing at root level where not attached to a WD item—to me they are low priority. That said, we should not have any work level disambiguation pages linked to WD, be it DB1911, DNB or whatever, and creation of further work-level pages should be dissuaded.

We need to ensure that is suitably covered in Help:Disambiguation

— billinghurst sDrewth 14:17, 15 November 2019 (UTC)

Signatories

Are signatories considered to be a kind of authors, i.e. can they have author pages even if no other work by them eligible for Wikisource exists? --Jan Kameníček (talk) 16:03, 13 November 2019 (UTC)

One more related question: If a person already has their author page, can I add there (under a separate heading) works which they have only signed? --Jan Kameníček (talk) 18:12, 13 November 2019 (UTC)

I think it depends - if they are just one of a whole bunch of signatures e.g. on a petition, I wouldn't bother creating an author page for them (but I might add it under a separate heading on an existing author page as you suggest). On the other hand, a situation like e.g. an official signing a document that was issued in their name but written by one of their staff, I would definitely treat the signing official as a full author. —Beleg Tâl (talk) 19:04, 13 November 2019 (UTC)

I see. What I had in mind was e.g. an international treaty signed by a bunch of statesmen, which is similar to your petition example. --Jan Kameníček (talk) 20:55, 13 November 2019 (UTC)

I would have just wikilinked it unless they are primary. It is one of those quandaries about how do we work with WhatLinksHere and running counts on those things to highlight where an author has exposure beyond the works they wrote. — billinghurst sDrewth 00:35, 14 November 2019 (UTC)

@Billinghurst: I see. So you do not think it should be mentioned at the author page if it is not a primary signature, right? --Jan Kameníček (talk) 13:24, 16 November 2019 (UTC)

I don't see how it is different from being mentioned/appearing in any article that we reproduce. I wikilink to their author page, and would only backlink back to the article where they are the focus. — billinghurst sDrewth 13:55, 16 November 2019 (UTC)

Proposal for a new Featured texts badge on Wikidata

I've created a proposal on Wikidata for a new "Featured texts" badge, to complement the existing "Featured article", "Featured list", and "Featured portal" badges used by the Wikipedias. If you have an opinion, please comment there (not here). Thanks. Kaldari (talk) 18:27, 15 November 2019 (UTC)

We already use the featured article badge for featured text (aliases at d:Q17437796). I think we felt that the words are interchangeable, and there are no link overlap issues. The Vampyre <-> d:Q58881954: if it isn't automatically appearing, that is our fault for not properly converting {{featured}} to properly leverage the tag.

We also need to better align d:help:badges of proofread, validated and digital document, as we should be building that into our {{header}} template. I note that there is a separation of Wikisource badge and Wikimedia badge. — billinghurst sDrewth 02:35, 16 November 2019 (UTC)

I have already brought up this topic on the proposal page itself. —Beleg Tâl (talk) 19:00, 16 November 2019 (UTC)

Tech News: 2019-47

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Problems

You will be able to read but not to edit some wikis for up to 30 minutes on 26 November at 06:00 (UTC). You can see which wikis. It will probably last much shorter than 30 minutes. This will also affect the centralauth database. This could for example affect changing passwords, logging in to new wikis, changing emails or global renames. [5]

Changes later this week

You can soon vote on proposals for the Community Wishlist Survey. The survey decides what the Community Tech team will work on. You can vote on proposals from 20 November to 2 December. This year the wishlist will focus on Wikibooks, Wiktionary, Wikiquote, Wikisource, Wikiversity, Wikispecies, Wikivoyage and Wikinews. You can read more about the format for this year.
There is no new MediaWiki version this week.

Meetings

You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 20 November at 16:00 (UTC). See how to join.

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

20:16, 18 November 2019 (UTC)

Duplicate works

Are The History of the City of Fredericksburg, Virginia and Fredericksburg, Virginia 1608-1908 the same work? They appear to be, although they have different authors (Silvanus Jackson Quinn versus Sylvanius Jackson Quinn). There is a scan on IA here (under the title The history of the city of Fredericksburg, Virginia), which appears to show the same. If so, should they be connected? The former a redirect to the latter, since it is complete? Could someone match-and-split to the scan? TE(æ)A,ea. (talk) 22:44, 21 November 2019 (UTC).

Looks the same to me, definitely should move the complete one to scan and change other to redirect —Beleg Tâl (talk) 01:22, 22 November 2019 (UTC)

horizontal TOC -- Template request

I have a long TOC at Translation:Likutei_Halakhot/Orach_Chayim/Early_Rising that I would like to make horizontal. The Wikipedia templates listed at w:Template:Horizontal_TOC namely {{horizontal TOC}}, {{horizontal TOC|nonum=yes}} etc. do not work here. Could those be made available here? Thanks! Nissimnanach (talk) 00:51, 22 November 2019 (UTC)Nissimnanach

@Nissimnanach: you could use __NOTOC__ to hide the current TOC, and manually create a horizontal one using {{Empty TOC}} —Beleg Tâl (talk) 01:24, 22 November 2019 (UTC)

@Beleg Tâl: I'll remember that but is there a good reason why the horizontal TOC template cannot be available here? I think it will save me effort and time and I'm lazy to manually do w:seds or Replaces and afraid of making mistakes. Nissimnanach (talk) 13:11, 22 November 2019 (UTC)Nissimnanach

@Nissimnanach: We generally prefer not to import general-purpose templates for one specific work, especially when we have other templates that work just as well (like {{Empty TOC}}). I'm hesitant to introduce an unwieldy and not-very-useful template like w:Template:Horizontal_TOC, but I would be more than willing to assist you with seds or replaces if you don't want to do them yourself. —Beleg Tâl (talk) 13:54, 22 November 2019 (UTC)

I've applied {{TOC limit}} for now - see if you think that's better. That said, as the page is currently 182,380 bytes long, it may be better to split each level-2 section to a sub-page. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:36, 22 November 2019 (UTC)

Wikisource is sixteen years old: let's celebrate!

Wikisource is 16 years old!

Dear friends,

In order to celebrate merrily the sixteenth birthday of Wikisource, the it.source community revamped the proofreading contest that for the last six years has gathered hundreds of jolly good proofreaders! The main page is at

Wikisource:Sedicesimo compleanno di Wikisource. (in Italian, sorry)

For two weeks we invite users to validate pages, and we will award three of them tens of euros to spend on books: visit the contest page for more details or ask me for them when in doubt.

This year we also have texts in Neapolitan, Venetian, Ligurian, Dolomitic Ladin and Lombard!

If you think that this announcement is worth sharing.... well, spread the news! :D

- εΔ ω 09:59, 24 November 2019 (UTC)

Time to vote for the Community Wishlist 2020

It is time to vote in the 2020 Community Wishlist Survey. Voting ends December 2nd. Wikisource has 28 proposals.

The proposal m:Community Wishlist Survey 2020/Wikisource/Improve export of electronic books is back this year. The tech team worked on it last year but had to work on other proposals. So many of us think that the work should be continued and completed.

There are also many good proposals. --Viticulum (talk) 16:37, 25 November 2019 (UTC) from the French Wikisource.

Tech News: 2019-48

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Changes later this week

The mobile beta mode will be disabled to have less maintenance. The developers will focus on the desktop improvements project. You can turn on advanced mobile contributions mode if you want to see the categories. You could also jump back to the top. This can instead be done with a gadget or user script. [6]
Parsoid is software we use for the visual editor, content translation, Flow and the Android app. This has been rewritten. It will come to the wikis gradually over the next two weeks. It has been tested, but there could be some diffs or previews that don't look right. If you see any you can report them. [7]
The new version of MediaWiki will be on test wikis and MediaWiki.org from 26 November. It will be on the other wikis next week (calendar). This is because of holidays.

Meetings

You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting will be on 27 November at 16:00 (UTC). See how to join.

Future changes

You will switch between the article and the talk page in a new way in the mobile view in the future. It will use tabs. This is more like in the desktop view. [8]

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

16:51, 25 November 2019 (UTC)

Category muddle: cultural events & traditions

At present, there is a thoroughly confused organization of categories relating to traditional observances, holidays, rites, and collective activities in various cultures. Category:Cultural events contains a mixture of all of those, and was apparently created solely to organize EB1911 articles. It is currently a subcategory of Category:Culture. It seems like some of that could go in Category:Traditions, but that currently contains only Category:Observances and Category:Holidays, "Holidays" being also a subcategory of "Observances;" if there’s a meaningful distinction between "Observances" and "Holidays" I’m not seeing it. How should all this be re-organized? I think some natural groupings might be 1. cultural traditions related to the rites of life such as weddings and funerals; 2. Festivals and holidays with their own history and traditions: Lupercalia/Easter/Arbor Day; 3. articles about specific collective activities considered as a cultural phenomenon, e.g. gladiatorial games and the English country fair. The first two could be grouped under Observances and called respectively Rites and Holidays; the third could be just a subcategory of Culture but I don't know what to call it. Is there also a need for a Traditions category to contain something-or-other that isn’t in those three? Levana Taylor (talk) 17:31, 25 November 2019 (UTC)

On Wikipedia, w:Category:Holidays are official designated days of observance and are a subcategory of w:Category:Observances. Observances that are not categorized as Holidays include w:Category:Anniversaries. On Commons, however, commons:Category:Observances is a subcategory of commons:Category:Holidays. Ultimately it probably doesn't matter. —Beleg Tâl (talk) 04:12, 26 November 2019 (UTC)

Thinking about it further, I don't think I would lump groups 1 and 2 together. Festivals/holidays/observances are a kind of cultural event, and so are weddings/funerals/birthdays (and so are fairs/sporting events/concerts/etc), but they aren't really the same kind of thing at all. I'd toss them all under Category:Cultural events with maybe a subcategory or two for group 2, any more than that is probably unnecessary.

Publishers' terminology

Question seen on Facebook:

When the publishing information is on a page that precedes the title page, is there a name for that page? Like title page verso when it's the back of the title page. It happens frequently with juvenile picture books.

Anyone know? Do we have a glossary of such terms? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 22:30, 26 November 2019 (UTC)

With a picture? I think "frontispiece" might be the term you're looking for? A glossary would be useful, but I don't think we have one. -Pete (talk) 23:38, 26 November 2019 (UTC)

No, without a picture. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 00:07, 27 November 2019 (UTC)

Do you mean a "copyright page"? These seem to have wandered over the years https://www.thebookdesigner.com/2009/09/parts-of-a-book/

sometimes you see colophon, but it’s greek to me. front matter varies a lot depending on date and publisher. Slowking4 ⚔ Rama's revenge 17:54, 28 November 2019 (UTC)

Author:Ferdinand_Moeller

The following discussion is closed:

resolved

I updated this author's page with a middle initial. Should the page be moved to Author:Ferdinand A. Moeller even if the name isn't completely filled out? —Crocojim18 (talk) 01:33, 5 November 2019 (UTC)

Looks like it has already been moved. —Beleg Tâl (talk) 20:49, 6 November 2019 (UTC)

This section was archived on a request by: --Xover (talk) 09:55, 29 November 2019 (UTC)

Abuse filter edit request

The following discussion is closed:

Declined. The noise in the logs from the filter is intended behaviour.

Hi. Can Special:AbuseFilter/36 please be tweaked to also exclude bots? When bots execute mass moves, they flood the log. Thanks, --DannyS712 (talk) 06:34, 2 November 2019 (UTC)

The purpose is to capture such moves where there is the potential for remaining redirects, so it is acting within scope of why I programmed it. As such it is recording what I want to see, so I am not considering it flooding the logs. — billinghurst sDrewth 06:40, 2 November 2019 (UTC)

@Billinghurst: my apologies, I thought it was for tracking misguided moves. However, bots also have suppressredirect, so if redirects aren't needed, wouldn't they be suppressed? Either way, thanks for explaining --DannyS712 (talk) 06:43, 2 November 2019 (UTC)

Yes, that is its primary purpose, though it is broader, for checking and also for clean-up. Suppressing redirects is not automatic, and there is no clear means to detect that no redirect has been created, so it is a checking process. It doesn't happen that often, so I am not concerned about the few occasions on which it occurs; it never truly floods the logs. Most bot moves usually occur early on, so it hasn't been problematic over the years. — billinghurst sDrewth 07:54, 2 November 2019 (UTC)

This section was archived on a request by: Xover (talk) 13:58, 29 November 2019 (UTC)

Index:Canadian Singers and Their Songs.djvu

The following discussion is closed:

The momentarily misplaced file has reappeared. :)

Tracked in Phabricator: Task T8071

What has happened to the above Index? It was here when I was working on it yesterday (Sat 2 Nov) in the morning Australian time. It now says Error: No such file. It was nearly proofread. --kathleen wright5 (talk) 07:06, 3 November 2019 (UTC)

@Kathleen.wright5: the file was deleted, per c:Commons:Deletion requests/File:Canadian Singers and Their Songs.djvu. As a result, the work here may need to be deleted too --DannyS712 (talk) 07:37, 3 November 2019 (UTC)

No. The file was apparently moved here. @Beleg Tâl: will be better placed to advise where it arrived. Beeswaxcandle (talk) 07:48, 3 November 2019 (UTC)

(edit conflict) @DannyS712: It was published before 1924 so it is in the public domain in the US (rule of thumb: US term of protection is 95 years from date of publication), and enWS policy is that works must be public domain in the US (vs. Commons that requires PD in both US and country of origin). @Beleg Tâl: On Commons you indicated that you had transwikied it here, but I can't find it. Can you look into it? --Xover (talk) 07:58, 3 November 2019 (UTC)
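The rule of thumb quoted above reduces to simple arithmetic. A minimal Python sketch, assuming the term runs to the end of the 95th calendar year after publication; the function name is hypothetical, and this ignores every other factor that can affect US copyright status.

```python
def us_public_domain_by_publication(publication_year, current_year):
    """Rule of thumb only: protection lasts through the end of
    publication_year + 95, so the work is free from the following year."""
    return current_year > publication_year + 95
```

In 2019 this gives the "published before 1924" cut-off: a 1923 work passes (1923 + 95 = 2018), while a 1924 work does not.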

very sad they should delete an entire compilation based on the dod of a single septuagenarian. but work can continue here when the promised transfer occurs. deletion on commons should never be a deletion rationale here; rather we should have our independent task flow and determination. Slowking4 ⚔ Rama's revenge 12:49, 3 November 2019 (UTC)

@Xover: @Beeswaxcandle: I did import the file (or thought I had done so). You can see that File:Canadian Singers and Their Songs.djvu is not redlinked, and does contain the licensing info I set up, so I'm not sure why the file itself is not there also. Fortunately, I can easily re-upload it from the source, and will do so as soon as I have a chance (probably later this evening). @Slowking4: we did have this discussion here, the work is unambiguously copyrighted in Canada and in violation of Commons policy, and I did (try to) move the file locally as part of our independent task flow and determination. —Beleg Tâl (talk) 21:13, 3 November 2019 (UTC)

@Beleg Tâl: You imported the File: page (the container), not the media:. You cannot special:import media files. — billinghurst sDrewth 21:34, 3 November 2019 (UTC)

as we see deletion is privileged, and saving by transfer is not. it is not obvious that it was a copyright vio since the nominator did not do the work of listing the authors. i guess that is the uploaders job, or the person transcribing here, otherwise we might have work after work deleted out from under a transcription effort. look forward to the required local upload from IA, since fairusebot is a distant memory. Slowking4 ⚔ Rama's revenge 02:52, 4 November 2019 (UTC)

That's lame. Looks like it's already being looked at on Phabricator, phab:T8071. —Beleg Tâl (talk) 14:29, 4 November 2019 (UTC)

I've also added it to the wishlist. —Beleg Tâl (talk) 14:46, 4 November 2019 (UTC)

This index seems to be here in some form. I've just validated Page:Canadian Singers and Their Songs.djvu/124 and it was proofread by Jason Boyd earlier today. [Revision history https://en.wikisource.org/w/index.php?title=Page:Canadian_Singers_and_Their_Songs.djvu/124&action=history] --kathleen wright5 (talk) 02:41, 4 November 2019 (UTC)

I have uploaded the file, everything is Done —Beleg Tâl (talk) 14:29, 4 November 2019 (UTC)

This section was archived on a request by: Xover (talk) 14:02, 29 November 2019 (UTC)

Automatically pull text status from Wikidata badges and display on main page of work

Per the discussion below at Wikisource:Scriptorium#KaldariBot, Sam Wilson and myself have set up a sandbox version of the header template that automatically pulls the text status of a work from its associated Wikidata item (based on the badge assigned to the sitelink) and then displays the appropriate indicator icon(s) in the upper-right part of the page and assigns the page to the appropriate category. (The icon and category are both taken from the properties of the badge's Wikidata item and are thus easily configurable.) To see some live examples of this, take a look at The Life of the Spider and The Riverside song book/The Open Window. Note that the template currently only pulls Wikisource badges (and thus not "featured" status since we are piggy-backing on Wikipedia's featured article badge), but this could easily be changed (See this and this). Please indicate below whether you think we should apply this functionality to the main {{header}} template, and thus to all works on Wikisource. Kaldari (talk) 23:42, 19 November 2019 (UTC)
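To make the data flow concrete, here is a rough Python sketch of reading sitelink badges from a Wikidata entity that has already been fetched as JSON. It is not the sandbox template's code. The entity layout (`sitelinks` → site key → `badges`) follows the Wikidata entity JSON format; the label mapping contains only the two badge items mentioned in this discussion, and the function names and sample entity are hypothetical.

```python
# Badge items mentioned in this discussion; labels are descriptive only.
BADGE_LABELS = {
    "Q17437796": "featured",
    "Q28064618": "digital document",
}


def sitelink_badges(entity, site="enwikisource"):
    """Return the list of badge QIDs on the entity's sitelink for one site."""
    sitelink = entity.get("sitelinks", {}).get(site)
    if sitelink is None:
        return []
    return list(sitelink.get("badges", []))


def status_labels(entity, labels=BADGE_LABELS, site="enwikisource"):
    """Translate known badge QIDs into labels, skipping unknown badges."""
    return [labels[b] for b in sitelink_badges(entity, site) if b in labels]


# Illustrative entity, not real data.
example = {
    "sitelinks": {
        "enwikisource": {"title": "Example work", "badges": ["Q28064618"]},
    }
}
print(status_labels(example))
```

Keeping the QID-to-label mapping as a parameter reflects the point above that icon and category should stay configurable rather than hard-coded.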

@Billinghurst, @Mpaa, @Beleg Tâl: ^. Kaldari (talk) 23:43, 19 November 2019 (UTC)

@Kaldari: fantastic work! Is there a reason why it checks for mainspace in the invocation? I ask out of curiosity, because {{header}} should only be used in mainspace anyway, so hardcoding that restriction into only the one component seems unnecessary —Beleg Tâl (talk) 23:58, 19 November 2019 (UTC)

No reason other than being overly cautious. I'll remove the restriction. Kaldari (talk) 00:10, 20 November 2019 (UTC)

I think it'd be terrific to have badges displayed more prominently here, so that maybe they get used more and so it's easier to query for works that are fully validated. Another thing I've been wondering about is digital document (Q28064618): do we have a category for these here? We should add it as topic's main category (P910). —Sam Wilson 10:01, 20 November 2019 (UTC)
Yes, it is a very good idea, thank you for introducing it. I also support the more prominent display of the badges. The text that appears when you hover over the badge (currently: "Help:Text status") might also be changed and tell the reader directly which status it is. --Jan Kameníček (talk) 10:58, 20 November 2019 (UTC)
What is digital document (Q28064618) and why is it useful? Wouldn't every single item on this site fall into such a category? —Beleg Tâl (talk) 11:50, 20 November 2019 (UTC)
- It's used to mark w:born-digital works, which might have a perfect text layer and not need proofreading, and for which the 'original' is of less importance (i.e. multiple copies are identical). Its history is described in phab:T153186. —Sam Wilson 23:39, 20 November 2019 (UTC)
Question: For Featured texts, would we want to do the same? I'm hesitant to store our badges off-site because it means that we won't have notification if a text's status is changed remotely. For fully validated works, I can see a simple means to double-check with a bot. But what would this mean for our Featured texts? --EncycloPetey (talk) 16:49, 20 November 2019 (UTC)
- It's possible to show Wikidata changes on your watchlist here. I know that can be annoying sometimes, because there can be lots of edits, but it does make it easier to catch changes to badges and other metadata, without having to go to Wikidata. —Sam Wilson 23:39, 20 November 2019 (UTC)
  - But does that mean we have to rely on editors here watching certain pages over there to catch this? What happens when membership here changes, and no one is watching those pages any longer? You've indicated that this is a thing which is possible, but is it advisable to do it that way? --EncycloPetey (talk) 00:30, 22 November 2019 (UTC)
    - @EncycloPetey: Nope, it's easier than that: for any page you watch here, changes to its Wikidata item will appear in your watchlist, regardless of whether you watch the page over on Wikidata or not. —Sam Wilson 03:23, 22 November 2019 (UTC)
      - Cool! Does this require setting a preference? -Pete (talk) 05:08, 22 November 2019 (UTC)
        @Peteforsyth: It can be enabled as a preference (in the Watchlist section of Special:Preferences) but it can also be turned on directly from the watchlist page (and saved in a watchlist filter, if desired; I have it enabled in my default filter). It's also available for Special:RecentChanges. —Sam Wilson 05:53, 22 November 2019 (UTC)
  - @Samwilson: That only answers the first part of my concern, not the second part. Doing things this way means that the only way we can keep track is if someone continuously active here has those pages on their watchlist, maintains constant vigilance, and never leaves. --EncycloPetey (talk) 05:45, 22 November 2019 (UTC)
    - @EncycloPetey: But that's true of pages here as well, isn't it? If a page here isn't on anyone's watchlist, then changes to it are likely to not be noticed. And with the categorization feature, badged pages will be added to relevant categories and so people watching those categories can see when things come and go. —Sam Wilson 05:53, 22 November 2019 (UTC)
      - No, that's not true. The difference is that, under our current procedures, anyone watching Recent Changes here can spot that kind of edit if it happens locally, and does not have to have special pages in their Watchlist. By contrast, the proposal would require current editors to add a specialized set of pages to their Watchlist and would require whoever is monitoring to remain active and vigilant in perpetuity, or at least perpetuate through other monitors. And if the person who has these pages on their Watchlist is away for a week, they might miss such changes. --EncycloPetey (talk)
      - Agreed, this is true of everything here. But there is an important point here. Currently featured texts are always protected, so admins always know if {{featured}} is removed from a page regardless of whether anyone is watching the page. We can't protect Wikidata items in the same way. I think we would need a bot to handle it, periodically checking that Wikidata badges are correct and fixing them when they are not. (Such a bot would also be useful to ensure the proofread status badge correctly matches the Index status field, at least until the Index status field is updated to store the info directly in Wikidata.) —Beleg Tâl (talk) 13:42, 22 November 2019 (UTC)
        But, as with having a Watchlist, it requires someone to have a bot and to use it regularly. This is an added specialist maintenance task. I've been involved in wikis when the one person with the necessary bot stopped editing, or stopped running the bot, or the community lost access to the bot. --EncycloPetey (talk) 16:44, 22 November 2019 (UTC)
        @EncycloPetey: My initial proposal was to manage all of this through specific Wikisource templates rather than using Wikidata (similar to how {{featured}} works). However, Beleg Tâl thought using Wikidata would be a better idea. Frankly I don't care which method is used, my only goal is to make validated texts discoverable. Perhaps I should run a poll to find out which option has more support. Kaldari (talk) 19:25, 22 November 2019 (UTC)
        Featured texts are a bit different from Validated texts. The primary indicator of a work's Featured status is the presence of {{featured}} on the work page itself. Thus the addition of the badge and the conferring of Featured status are actually the same thing. The primary indicator of a work's Validated status is the Progress field of the transcluded Index page. Copying this status manually from the Index page to a template in Mainspace is the sort of task which is invariably ignored by most editors and generally leads to a massive and ever-increasing backlog. Fortunately there is a tool whose entire purpose is to allow metadata to be centrally stored so as to prevent this exact issue; that tool is of course Wikidata. And as it happens, Wikidata already has a system set up to store this exact metadata in the form of badges. The automatic sync between Index page and Wikidata does not yet exist, so the backlog I spoke of can be seen by observing how few Wikisource texts have Wikidata items with status badges (i.e. almost none of them). Creating a bot to sync the data between Index pages and Wikidata, and then pulling the data directly from Wikidata, is exactly how Wikidata is intended to be used. Having a bot sync the data between Index pages and a third location (i.e. mainspace) is re-inventing the wheel that is already in place in the form of Wikidata, and it still leaves the Wikidata backlog unaddressed. —Beleg Tâl (talk) 19:54, 22 November 2019 (UTC)
        
        @Kaldari: Your initial proposal was only in regard to Validated works, which is a totally different thing from Featured works, as Beleg Tâl has explained above. The Validated works have an unprocessed backlog that is constantly growing, and that is worth addressing with some innovative solution. The Featured texts have no such backlog, and produce new items at a maximum rate of one per month. Creating a bot-reliant monitoring process that depends on outside data storage for the Featured texts is using a particle collider to open a peanut. --EncycloPetey (talk) 21:01, 22 November 2019 (UTC)
        @EncycloPetey: Got it. So you would prefer that text statuses like "validated" and "proofread" be handled with bots and Wikidata (which I'm happy to implement), but that "featured" continue to be handled manually with a template? If so, that's fine with me. Kaldari (talk) 22:24, 22 November 2019 (UTC)
        Yes, at least for now, based on the way cross-wiki data is handled. --EncycloPetey (talk) 00:24, 23 November 2019 (UTC)
        I agree with this solution too. I am also not convinced that anybody notices if somebody changes the status at Wikidata. Wikidata's anti-vandalism protection is very weak by our standards, and although it is possible to turn on the display of Wikidata changes here, my experience is that very few people do so, and the chance that such a change gets through unnoticed is high. --Jan Kameníček (talk) 11:52, 23 November 2019 (UTC)

@EncycloPetey, @Jan.Kamenicek: I have withdrawn my proposal on Wikidata per your feedback. Do you have any further concerns with moving this forward? Kaldari (talk) 18:21, 24 November 2019 (UTC)

I have no other objections. Thank you for introducing the badges! --Jan Kameníček (talk) 20:09, 24 November 2019 (UTC)

Just one more detail: Above I suggested changing the text that appears when you hover over the badge, but I do not know whether my idea has been rejected or just gone unnoticed. Currently the text says just "Help:Text status". I suggest replacing this text with the status itself, e.g. "proofread" or "validated". --Jan Kameníček (talk) 20:20, 24 November 2019 (UTC)

@Jan.Kamenicek: Great suggestion. I'll see about implementing that. Kaldari (talk) 18:13, 25 November 2019 (UTC)

@Jan.Kamenicek: I've implemented your suggestion in the sandbox template. See The Riverside song book/The Open Window for example. Kaldari (talk) 21:00, 26 November 2019 (UTC)

Perfect, thanks. --Jan Kameníček (talk) 21:44, 26 November 2019 (UTC)

No, I have no other concerns besides Featured texts. If the community is fine with marking Validation status on Wikidata, then I back that as well. --EncycloPetey (talk) 21:55, 24 November 2019 (UTC)

Just waiting for someone to make the change at {{header}}. Kaldari (talk) 18:37, 28 November 2019 (UTC)

This is done now. Kaldari (talk) 22:02, 3 December 2019 (UTC)
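
The Index-page-to-Wikidata sync discussed above (described in the thread as not yet existing) could be sketched roughly as follows. This is only an illustration of the idea, not the bot or template logic that was actually implemented. The progress codes ("T", "V") and the status names used as badge labels are assumptions made for the example, and no real Wikidata item IDs or API calls are used; the function only works out which works would need a badge change.

```python
from typing import Dict, Optional, Tuple

# Assumed mapping from an Index page's Progress field to a text status.
# The codes "T" and "V" are illustrative assumptions, not confirmed values.
PROGRESS_TO_STATUS: Dict[str, str] = {
    "T": "validated",
    "V": "proofread",
}


def desired_badge(progress: str) -> Optional[str]:
    """Return the status badge implied by a Progress code, or None."""
    return PROGRESS_TO_STATUS.get(progress.strip().upper())


def plan_badge_updates(
    index_progress: Dict[str, str],
    current_badges: Dict[str, Optional[str]],
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Compare Index page progress with the badges currently stored.

    Returns a mapping of work title -> (current badge, wanted badge)
    for every work whose stored badge differs from what the Index page
    implies. Works that are already in sync are left out.
    """
    updates: Dict[str, Tuple[Optional[str], Optional[str]]] = {}
    for work, progress in index_progress.items():
        wanted = desired_badge(progress)
        current = current_badges.get(work)
        if wanted != current:
            updates[work] = (current, wanted)
    return updates


# Hypothetical sample data: work titles and codes are made up.
sample_progress = {"Work A": "T", "Work B": "V", "Work C": "C"}
sample_badges = {"Work A": None, "Work B": "proofread"}
planned = plan_badge_updates(sample_progress, sample_badges)
```

With this sample data only "Work A" would need a change, since "Work B" already carries the matching badge and "Work C" has neither a mapped status nor a badge. A real bot would still have to read the Progress field from each Index page and write the badge to the corresponding Wikidata sitelink, which is left out here.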

Bulk replace

Do we have a tool to do find'n'replace across all the pages in a work? For example, to change all instances of {{frac}} to {{sfrac}}? Or do I need to find a willing bot operator for such things? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 16:48, 20 November 2019 (UTC)

@Pigsonthewing: That would be a bot request. --Xover (talk) 19:22, 20 November 2019 (UTC)

Now sorted by DannyS712, using AWB. Thanks, all. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:11, 9 December 2019 (UTC)
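
For readers wondering what such a bulk replacement involves, here is a minimal sketch in Python. It is not what AWB does internally; it only shows a template rename applied to page text held in a dictionary. The function names and the sample page titles are hypothetical, and the sketch assumes the template name is followed directly by "|" or "}}" and ignores cases such as templates inside nowiki tags or comments.

```python
import re
from typing import Dict, Tuple


def rename_template(wikitext: str, old: str, new: str) -> Tuple[str, int]:
    """Rename template calls {{old...}} to {{new...}} in one page's text.

    The first letter of the old name is matched case-insensitively, and a
    lookahead requires "|" or "}" after the name so that longer names such
    as {{fraction}} are not touched. Returns the new text and the number
    of replacements made.
    """
    first, rest = old[0], old[1:]
    pattern = re.compile(
        r"\{\{(\s*)(?:"
        + re.escape(first.upper())
        + "|"
        + re.escape(first.lower())
        + ")"
        + re.escape(rest)
        + r"(\s*)(?=[|}])"
    )
    return pattern.subn(lambda m: "{{" + m.group(1) + new + m.group(2), wikitext)


def bulk_rename(pages: Dict[str, str], old: str, new: str) -> Dict[str, str]:
    """Apply the rename to every page and return only the pages that changed."""
    changed: Dict[str, str] = {}
    for title, text in pages.items():
        new_text, count = rename_template(text, old, new)
        if count:
            changed[title] = new_text
    return changed


# Hypothetical pages of a work, keyed by page title.
sample_pages = {
    "Page:Example.djvu/1": "Take {{frac|1|2}} of it and {{Frac|3|4}} more.",
    "Page:Example.djvu/2": "No fractions here, only a {{fraction|1}} lookalike.",
}
result = bulk_rename(sample_pages, "frac", "sfrac")
```

Returning only the changed pages keeps the number of saves to a minimum, which matters when the edits are made by a bot or a semi-automated tool across a whole work.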

This section was archived on a request by: Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 17:11, 9 December 2019 (UTC)

Proposed changes to WS:WWI regarding advertisements

The following discussion is closed:

The proposed changes have been implemented.

There is a proposal to update the wording of our policy regarding the inclusion of advertisements, in particular advertisements that are part of a larger transcluded text. Please see the discussion at Wikisource talk:What Wikisource includes#Proposed changes to Advertisement section. —Beleg Tâl (talk) 13:50, 15 November 2019 (UTC)

This is more of a clarification than a change of policy. The previous instructions were very vague and confusing. Kaldari (talk) 23:27, 18 November 2019 (UTC)

This section was archived on a request by: Xover (talk) 08:48, 14 December 2019 (UTC)

Once a Week Vol. 7

The following discussion is closed:

resolved

Vol. 7 of Once a Week is missing pages 629 and 630. Of the two scans at IA, this is the one that has all the pages. Could someone please use it to repair the DjVu, or replace the whole thing if necessary, since the complete scan is of fairly decent quality? Levana Taylor (talk) 22:32, 23 November 2019 (UTC)

@Levana Taylor: Done. Sorry about the delay. --Xover (talk) 13:50, 29 November 2019 (UTC)

This section was archived on a request by: Mpaa (talk) 20:37, 21 December 2019 (UTC)

How shall I transcribe two books in one?

The following discussion is closed:

Resolved.

I have started working on a publication of engravings by Wenceslaus Hollar. The book does not contain the year of publication, but HathiTrust states that it was published between 1794 and 1812. The book looks like a reprint of originally two separate books, one published in 1640 and the other in 1643. The problem is that this reprint does not have one title common for both parts.

Can I transcribe the publication as two separate works under their individual titles? Or should I transcribe them as one work and devise some title? I was considering using the first of the titles for the whole publication, but it would be really misleading, as it speaks only about England, while the other part deals with various European countries. --Jan Kameníček (talk) 19:50, 22 November 2019 (UTC)

I'd just transcribe them as two separate works, if there's no overall introduction or anything.--Prosfilaes (talk) 21:04, 23 November 2019 (UTC)

I also think it is the best solution, but I wanted to have it confirmed by somebody else. Thank you very much. --Jan Kameníček (talk) 21:25, 23 November 2019 (UTC)

I, on the other hand, would probably transcribe them as one work and devise some title, like I did with The Holly & the Ivy, and Twelve Articles and Lyra Ecclesiastica —Beleg Tâl (talk) 21:30, 23 November 2019 (UTC)

Hm, simply connecting the two titles with "and" could also be a solution. I’ll think about it for a while, thanks as well. --Jan Kameníček (talk) 23:20, 23 November 2019 (UTC)

I don't think there is a clear answer in general; this sort of thing needs a judgement call for each work, with quite some leeway for individual contributor preference. It also needs to be considered whether the book in question is actually a single publication and not merely two works bound together (as was common practice for collectors of all stripes in the 18th and early 19th centuries). On this particular book, the fact that the two works have the same publisher might suggest they are one publication, while the fact that both included works have separate colophons suggests they are independent publications bound together. Similarly, there appears to be no front or end matter that is common to both works: they share only the binding. It is hard to be categorical, but I suspect I would have eventually landed on treating these as separate works that had merely been bound together. But I would not have faulted anyone for landing on the opposite.

Incidentally, the publishers, “Laurie & Whittle”, are still around, trading these days as “Imray Laurie Norie & Wilson Ltd”. --Xover (talk) 08:32, 24 November 2019 (UTC)

I know of a number of examples where works more or less related were packed into one binding because of publishing constraints. I think that we should make sure that separate parts are separated out, like they would be in an anthology or magazine, and make them available individually, even if they are under a higher-level heading for the complete work.--Prosfilaes (talk) 02:00, 27 November 2019 (UTC)

Which has been done by creating redirects at the root where they have been displayed as subpages. Where they have a set of known publishing components, especially with regard to how they are portrayed at Wikidata, then keeping to the known truth is best. Here the provenance of the work is simply not known, we just know that they shared the same binding.

We know that many of our works were singly published, serially published, and multiply published, so do what makes most sense and maintains the credibility of the publication/work(s). Document it well, either in notes or on the talk page, so that someone can understand what you did when they look at it in five years' time. -- billinghurst sDrewth 04:30, 27 November 2019 (UTC)

Thanks, everybody, for your valuable opinions. I have considered them all and finally decided to keep them together (as the publisher enclosed them in a common binding), but as two separate subpages and with an explanation in the note. I think this solution shows that originally they were separate, and at the same time it is faithful to the intention of the reprint’s publisher. --Jan Kameníček (talk) 00:11, 8 December 2019 (UTC)

This section was archived on a request by: Xover (talk) 10:57, 29 December 2019 (UTC)

