Fields transcribed from the original page
A number of people have asked why there is slightly more information on the original page than there is on the transcript.
When we transcribe the census, we transcribe everything on the original form except for the number of people in the house. The reason we do not transcribe the number of people in the house is that we do not believe that it is a particularly useful piece of data to include in the search engine (very few people would know this information, although arguably it could be useful for sociologists analysing the data in bulk). The reason for creating the transcriptions is simply to allow us to build a search engine which can analyse the most useful information provided in the original pages and provide results based on this to guide you to the original pages.
So the only other information that is included on the original page but not on the transcription is the number of living children born to the marriage, number dead and number of rooms in the house.
Again the reason we do not include this on the transcript is because we do not believe that this information is particularly useful as a search field and it is therefore excluded from the search options as well. All other fields are included on the transcript as they are all available as options in the advanced search.
The concept and purpose of the transcripts on the 1911census site (and indeed all findmypast.com historical records) is to act simply as a finding aid for the original page.
We always recommend that family historians (as all good historians should) rely on the original record wherever possible as the single definitive source of truth. It is also the source of those extra details, which are unlikely to be known in advance with any certainty and so are of little use as search fields, but which are potentially valuable for further research.
Tags: field, original, page, transcript
February 4th, 2009 at 6:13 pm
The simple fact is that the costs for the original pages are so high that I have only looked at a few. If you include information such as the number of years a couple have been married in the transcript the rest of the information should be there. This is simply a way of you getting more money!!
February 7th, 2009 at 3:01 pm
A transcript is a typed version of what is on the page for those unable to read the original, or for those who want the information on the page but do not require to see the original.
It is very disheartening to see you providing a list of selected information culled from your search engine and calling it ‘a transcript’. (Possibly breaking trade description legislation too)
For your own good, if not for the sake of those who have wasted money on this, can I suggest you at least describe it as a ‘partial’ or ‘selected’ transcript.
February 8th, 2009 at 5:19 pm
If the transcript is, as is said on the website, a typed version of what was on the page, then all information should be included.
The original images are exorbitantly expensive, so the only way the ordinary researcher can view all family is to rely on the transcript. It should be complete
February 12th, 2009 at 2:40 pm
There are a number of points I would like to take issue with.
1) you state “The concept and purpose of the transcripts on the 1911census site (and indeed all findmypast.com historical records) is to act simply as a finding aid for the original page” WRONG!!!
The purpose of an index is to locate the original page; a transcript should show all information on that page.
2) A massive thing is made of this census as being the fertility census, i.e. the number of children born to the marriage who are living/died etc.
Yet you decide to withhold this info from the transcript, which is possibly the most important bit of info that anyone wants to make sure they have accounted for all children, so to force everyone to pay for an original view at your exorbitant prices is just a plain rip off.
3) You also state “Again the reason we do not include this on the transcript is because we do not believe that this information is particularly useful as a search field and it is therefore excluded from the search options as well”
Whether a piece of info is useful to search on or not does not mean it should not be on the transcript. See Familysearch.org re the 1881 census. They don’t allow search on occupation but it is included in the transcript.
February 13th, 2009 at 1:15 pm
If the concept and purpose of the transcripts is to act simply as a finding aid then what is the concept and purpose of the search results?
If you are saying that we should rely on the original record as the single definitive source is it because the transcripts are flawed? If so why are we required to pay for them?
A blind person who relies on a screen reader and cannot access images would be sadly disappointed at what you are passing off as a transcript. Do you have an accessibility policy?
February 13th, 2009 at 4:31 pm
@Tim - indexes vs transcripts, strictly speaking you are correct, they are different. However the blog is written in a way to try to be accessible to all including newcomers and I didn’t want to confuse beginners with the difference between a transcript and index.
On what is included on the transcript vs the original record, we will have to agree to disagree. There is no specific industry standard on how much information should be included in a transcript vs an original record. Sites include more or less information as they deem appropriate and findmypast is one of the industry leaders in terms of how much we transcribe.
@Davey: it is not because the transcripts are flawed, it is because any historian (family or otherwise) should always seek to work from original sources. No transcript will *ever* be 100% correct nor can it be on a record set of this size. However we aim to move the transcript closer and closer to accuracy over time.
We do have an accessibility policy linked from every page - please see the link marked “accessibility” at the bottom of all pages.
February 14th, 2009 at 8:10 pm
A transcription service includes everything included on a page or document. There are many census indexes available, none of which claim to be transcriptions, because they aren’t: they are purely indexes, a finding aid. An index in a book is not the entire book repeated, it’s a guide to where to find specific chapters or info. If you choose to leave any information out for whatever reason, then it should be stated to be a partial transcription.
Thanks very much for the comment regarding screen readers Davey, you are spot on! Totally blind and deaf-blind people are excluded from information that has not been transcribed. Although I was happy to be able to be involved in the Beta testing, this was one issue I sadly failed to identify. Judging by the reply received on this subject, I don’t think giving feedback on it would have changed a thing anyway.
FMP seem to be very up and down on what is acceptable. Some of the censuses don’t have addresses and occupations transcribed, whereas on this site it’s deemed essential? Double standards? Having said that, it’s far better than Ancestry where, apart from the 1881 Census, no other has addresses and occupations transcribed. And us good old blind folk have to pay exactly the same as anyone else; I estimate at least half of the Ancestry site is inaccessible to us.
Whether a certain piece of info is or isn’t necessary should be the choice of the person doing the searching, not the choice of the person or people doing the transcribing.
February 14th, 2009 at 8:15 pm
Just to clarify, a deaf-blind person wouldn’t use a screen reader, of course. I should have mentioned that they use electronic Braille displays, but just like screen readers they rely on text, not images.
February 15th, 2009 at 12:43 pm
I agree with Davey. If the purpose of the transcripts is purely to lead you to the original page, you should not be charging for the transcript. The whole thing seems rather unethical. You are charging for what is often an inferior product - only half the data and the transcription error rate is much much higher than you say.
February 16th, 2009 at 8:36 am
@David: both your claims are inaccurate: firstly, significantly more than half the data is transcribed. From the post, you can see what is not transcribed and this represents a very small proportion of the overall data.
In terms of the accuracy rate - this has been meticulously batch tested at all stages of the release - if you have any evidence to support a lower rate using a statistically valid method of randomly sampling the data, do feel free to share it.
@all: the site was tested in depth for accessibility issues prior to release and is accessible to the majority of those using text-based readers. However it’s not perfect and we will continue to improve it.
February 16th, 2009 at 4:08 pm
Ian. Why be so choosy about what you do and don’t transcribe? The only reason why you have left out some bits is so that people feel they have to go on and look at the original and pay more money. I agree that we should look at originals, but your costs are so high that for many of us this is not possible.
February 16th, 2009 at 7:59 pm
@David: the costs will be significantly more affordable when the subscription option launches on findmypast.com in the summer if you need to look at a reasonable number of records.
February 17th, 2009 at 11:41 am
Ian,
What is the degree of accuracy of transcription of the name fields? Can you quote a figure for those? Or is your quoted accuracy measured across all fields, including those in which transcription errors are less likely to occur? In other words, what is the degree of keystroke accuracy in the fields that matter most?
Your accessibility page says.
“Images
Any content images used in this service include descriptive alt attributes. Purely decorative graphics include null alt attributes. The exception to this are the images of the original historical documents for which it is not possible to provide fully descriptive alt attributes. For a full description, users should refer to the household transcripts.”
How do you define “full description”? Would it not be better to change the wording to “partial description” and leave the blind in the dark rather than mislead them?
February 17th, 2009 at 12:17 pm
@davey - the figure quoted is across all fields. Reporting at a more granular level than this on a per-piece basis would be extremely expensive and time-consuming. Keystroke accuracy is unlikely to vary significantly by field. I will ask the editorial people to look at the accessibility statement.
February 17th, 2009 at 2:14 pm
Ian,
To state that you think there would not be a significant difference between the accuracy of, say, an age field or relationship field, where the options are strictly finite, compared to a surname field where possibilities are (almost) endless is very strange. The error percentage will obviously be significantly skewed to a higher rate for surname fields, etc.
Also, if the “transcriptions” are purely a finding aid, why are they not either (a) free (after all, no one would pay to buy the index of a book!!) or (b) part-payment towards the final image (i.e. 10 credits for the transcription plus another 20 when you get the image, but still 30 if you go straight to the image)?
February 17th, 2009 at 4:59 pm
@Chipp: I think you have misunderstood what I wrote. Keystroke accuracy is different from field transcription accuracy. Longer fields will obviously have a higher propensity to be incorrect as more characters are being transcribed.
The transcriptions are not free because this is a commercial service: the search (which uses transcribed fields) is the finding aid and this is free. The payment model is not going to be changed as it works for the majority of people and has been established and tested over a number of years across our other services (and those of many other providers).
February 17th, 2009 at 9:34 pm
Ian,
You misunderstood why I said the errors would be skewed towards fields such as Surname. I was implying that a field such as Relationship has a limited number of options (head, wife, son, daughter, etc), so it would be significantly easier for someone to make a correct guess at some difficult writing, whereas the combinations of surnames (for example) are so huge that the chances of guessing correctly are greatly reduced.
However, your previous answer was enlightening. You seem to be saying that the error rate you quote is based on keystrokes. If this is correct then it is obvious why the name field is often difficult to successfully search. Given a name length (first and surname) of 15 characters, an accuracy rate of 99% (I haven’t seen the latest rate quoted but I believe you previously quoted something in that area) would mean that 1 in every 7 or 8 names would have a typo. For an age field it would be in the region of one in every 50. That is before the natural skewing mentioned above. Interesting!
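The arithmetic in the comment above can be checked with a short sketch. It assumes, purely for illustration, that the 99% figure is a per-keystroke accuracy and that keystroke errors are independent of one another; neither assumption comes from findmypast. The 2-character age field and the function name are also illustrative choices.

```python
def field_error_probability(keystroke_accuracy: float, length: int) -> float:
    """Chance that a field of `length` characters contains at least one
    wrong keystroke, assuming each keystroke is independently correct
    with probability `keystroke_accuracy` (an illustrative assumption)."""
    return 1.0 - keystroke_accuracy ** length


# Figures taken from the comment: 99% accuracy, 15-character full name.
name_error = field_error_probability(0.99, 15)
# Assumption: an age is typically keyed as 2 characters.
age_error = field_error_probability(0.99, 2)

print(f"15-character name: {name_error:.1%} chance of at least one typo")
print(f"2-character age:   {age_error:.1%} chance of at least one typo")
```

Under these assumptions roughly 14% of names (about one in seven) and about 2% of ages (about one in fifty) would contain at least one typo, which matches the commenter's estimate. It says nothing about how the published accuracy rate is actually measured.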
February 18th, 2009 at 8:43 am
@Chipp: I’m not going to get drawn into a long discussion about this, but your assumption about keystrokes being related to the error rate quoted is totally incorrect. To recap:
The transcriptions are checked on a per-piece basis.
The error rate is calculated on a per-field basis (i.e. whole field accurate = accept, error in field = reject).
The accuracy rate on a per field basis is in excess of 98.5% for every piece released to date and will continue for each released subsequently.
Any pieces coming in under this rate are rejected and rekeyed.
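The acceptance rule described above can be sketched as follows. This is a minimal illustration, not findmypast's actual checking process: the function names, the sample data and the way checked values are obtained are all hypothetical, and treating a piece that scores exactly 98.5% as accepted is an assumption, since the reply does not say which way that case falls.

```python
ACCEPT_THRESHOLD = 0.985  # per-field accuracy rate quoted in the reply


def field_accuracy(sampled_fields):
    """Share of sampled fields whose transcription matches the checked value.

    `sampled_fields` is a list of (transcribed, checked) string pairs.
    A field counts as accurate only if the whole field matches, so a
    single wrong character makes the whole field an error.
    """
    if not sampled_fields:
        raise ValueError("no fields sampled")
    correct = sum(1 for transcribed, checked in sampled_fields
                  if transcribed == checked)
    return correct / len(sampled_fields)


def piece_decision(sampled_fields, threshold=ACCEPT_THRESHOLD):
    """Accept a piece at or above the threshold, otherwise send it back."""
    if field_accuracy(sampled_fields) >= threshold:
        return "accept"
    return "reject and rekey"


# Hypothetical sample of 200 checked fields from one piece.
good_piece = [("Smith", "Smith")] * 198 + [("Smyth", "Smith")] * 2
poor_piece = [("Smith", "Smith")] * 196 + [("Smyth", "Smith")] * 4

print(piece_decision(good_piece))  # 198 of 200 fields correct
print(piece_decision(poor_piece))  # 196 of 200 fields correct
```

Because the rate is counted per field rather than per keystroke, a 15-character surname with one wrong letter and a 2-character age with one wrong digit each count as exactly one error.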
February 21st, 2009 at 4:34 am
I raised this issue through the “correct route” of customer support and received no response.
Why, when I paid 30 credits for an original image (institution), did I have to pay an additional 10 credits for the transcript just to find out what the institution was? In this case, the image had *less* information than the transcript. In fact the search results (free) had more information than the image.
I understand this is a commercial service and I expect a few tricks to boost revenue. But when I pay the premium price I expect a premium product. If the transcript is really just a “finding aid”, why is it not included in the cost of the image?
February 23rd, 2009 at 10:13 am
What a pity that FMP is spoiling the ship for a ha’porth of tar.
1. The simplified definition of a transcript/transcription is a written copy of an original. Ergo, no fields or details on the original should be omitted. I recommend that FMP consult, say, The Shorter Oxford English Dictionary on Historical Principles for a fuller definition.
It is deceptive of FMP to represent that it is offering a transcription when in fact it is offering a partial transcription.
2. The comment that John makes above has been made several times in different ways, and I endorse it.
If a person purchases an image of the original for 30 credits, then purchase of the corresponding so-called transcription should incur no extra charge.
Likewise, if a person purchases a so-called transcription for 10 credits then the purchase of the corresponding image should cost a further 20 credits only, not a further 30.
I cannot understand for the life of me why FMP has dug its heels in on this issue, and has alienated its customers in the process.
I suggest that FMP reconsider its position immediately. It will have more satisfied customers, and hence more revenue, by doing so.
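For readers comparing the two schemes, the difference can be laid out as a small sketch. The credit prices (10 for a transcript, 30 for an image) come from the comments above; the function names are hypothetical, and the second function models only the commenter's suggestion, not anything findmypast has offered.

```python
TRANSCRIPT_CREDITS = 10
IMAGE_CREDITS = 30


def current_cost(buy_transcript: bool, buy_image: bool) -> int:
    """Existing model as described in the comments: each item is charged
    in full, with no offset for having bought the other."""
    return TRANSCRIPT_CREDITS * buy_transcript + IMAGE_CREDITS * buy_image


def suggested_cost(buy_transcript: bool, buy_image: bool) -> int:
    """Commenter's suggestion: the image price covers the transcript,
    so the transcript cost counts towards the image."""
    if buy_image:
        return IMAGE_CREDITS
    return TRANSCRIPT_CREDITS if buy_transcript else 0


both_current = current_cost(True, True)
both_suggested = suggested_cost(True, True)
print(f"Transcript and image, current model:   {both_current} credits")
print(f"Transcript and image, suggested model: {both_suggested} credits")
```

The two schemes differ only when both items are bought for the same household: 40 credits under the existing model against 30 under the suggestion.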
February 23rd, 2009 at 10:23 am
@Noel: We have taken the feedback on board (as well as a lot of other feedback expressing satisfaction with the existing pricing model) and I have explained why we have decided not to proceed with a change to a pricing model that has been extensively tested over several years with few complaints.
February 23rd, 2009 at 6:23 pm
“a lot of other feedback expressing satisfaction with the existing pricing model”
Oh, come on! You surely don’t expect us to believe that there are people who would rather pay an extra 10 credits. Because that is what your response implies.
Noel has summed up the situation very well.
February 25th, 2009 at 4:51 am
Ian, after today I promise not to debate this any further with you. But I must respond to your comment on my post above, and then I will leave the matter alone.
My comment was directed at the specific pricing for the 1911 census. Nothing more, nothing less.
I challenge you to produce “…a lot of other feedback expressing satisfaction with the existing pricing model” for the 1911 census. And, given that the 1911 census was released only in December 2008, it is impossible for you to claim that the existing pricing model for the 1911 census “… has been extensively tested over several years with few complaints.”
I have a subscription to FMP, with which I am well satisfied. I believe that, by and large, you have done a good job with the 1911 census. When all the images for a particular household become available, then I do not think that the charge of 30 credits for a set of original images is excessive. Nor do I think that 10 credits for a partial transcription is excessive.
What I do think is excessive is to charge more than 30 credits for “originals plus transcription” as explained in my original post. And I very much doubt that there are many family historians who would think otherwise. By charging 40 credits you are exploiting a monopoly position and double-dipping. Ultimately, this is ill-advised business practice.
It is not too late for FMP to rethink its position on this matter, and I urge you to do so.
February 25th, 2009 at 9:23 am
@Noel: we have over 10,000 completed surveys from the beta testing and large amounts of qualitative research from detailed user testing further back in the year. The vast majority pronounced themselves either somewhat satisfied or completely satisfied with the pricing model. We don’t do these things without testing them first and at some scale.
The pricing model of variations between the costs of transcripts and images and particularly no “refund” on the transcript cost has been tested extensively on various datasets on findmypast.com since we launched the 1861 census in 2005, and specifically on the BT27 Passenger Lists where the higher costs of full-colour, original-scan digitisation made this necessary. We have never had any significant comment or feedback on this matter.
As far as I am aware, no other major site offers this model (refund on transcript when viewing original image), but I am happy to be corrected.
If many other commercial sites are doing this and we are bucking the trend, we will happily accept that we are pursuing an “ill-advised business practice”, “double-dipping” and “exploiting a monopoly position”. If not, I would say that we are following an established online business model for family history data and making pricing both simple to understand and transparent.
I understand that people are always happier paying less (me included), but even if we did offer a refund mechanism (which would make pricing harder for users to understand) we would have to compensate by raising the number of credits for either the transcription or the original image (we still have to pay back a large investment in making the data available). So you would almost certainly end up paying the same. I do understand your arguments and am sympathetic to them, but we did not take the pricing decision lightly or blindly.
The availability of the 1911 census on a subscription basis on findmypast.com later in the year will make this argument largely redundant, but for the moment the position will remain unchanged. Apart from anything else, the administration of refunds to customers who have already paid would be a logistical nightmare which would distract from the rollout of the remaining counties and images.
I hope this clears things up, but also that the subscription will be a serious boon to “power users” although we appreciate it’s frustrating waiting for it to appear.
@Jo: I didn’t say that, you’re putting words into my mouth! Please see my comments to Noel about raising the credit prices to compensate for a refund on the transcription cost.
February 25th, 2009 at 1:41 pm
I personally think the 1911 is a load of s*** (moderated) - I f****** (moderated) hate it and the company who rip us off with it!
Moderator’s note: expletives removed, otherwise unedited
February 25th, 2009 at 7:35 pm
Wayne,
I am disgusted you have to use that language. A lot of us have said things good and bad about 1911 and I myself have said a few things but I would never use language like that. I am very surprised it has been allowed on the blog and I hope it gets removed.
The blog and its moderators have been very kind in letting us air our views, but I hope I speak for all of us when I say I think you have gone too far.
Mark
February 27th, 2009 at 8:49 pm
Hi Mark..above
My complete sentiments, thanks for saying them.
I have nothing against Wayne’s views, in fact I value them, but the choice of childish language seems to me to be 100% out of order.
Thanks Ian for your moderating of this blog for us all.
Tony
March 1st, 2009 at 11:56 am
I totally agree with everything Noel has said. Also, if the pricing had been cheaper I would have found it better value for money and probably spent more!!
March 5th, 2009 at 11:07 am
Ian,
Ref my earlier comment on the language that was used. I am not a prude and am old enough to know people do resort to this sort of language. But one thing I do know is that a lot of people do get offended by this kind of thing, and rightly so. All I am asking is that the entry be removed from this blog as I do not think it fits in here. I hope I speak for a lot more who use this blog.
mark
March 5th, 2009 at 12:25 pm
@ Mark: I will remove the rude words, but leave the comment otherwise intact….
March 5th, 2009 at 1:02 pm
Ian,
Many thanks
Mark