Archive for the ‘Data’ Category

Name variants now switched on

Tuesday, January 27th, 2009

We are happy to tell you that we have now applied changes to the data on 1911 Census to allow you to search for variants for both first names and last names.

The variants search is essentially a large thesaurus that identifies common variants, mis-spellings and alternative spellings for many of the names within the census. This should be of great help in tracking down elusive ancestors that you have so far not been able to identify. 
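
To illustrate the idea of a name thesaurus, here is a minimal sketch in Python. It is not the code that runs on 1911census.co.uk: the name groups, `VARIANT_GROUPS`, `expand_name` and `matches` are all invented for this example, and a real variant list would be far larger.

```python
# Hypothetical variant groups: every spelling in a group is treated as
# equivalent when searching. These groups are illustrative only.
VARIANT_GROUPS = [
    {"catherine", "katherine", "kathryn", "catharine"},
    {"smith", "smyth", "smythe"},
]


def build_index(groups):
    """Map each spelling to the full set of spellings in its group."""
    index = {}
    for group in groups:
        for name in group:
            index.setdefault(name, set()).update(group)
    return index


VARIANT_INDEX = build_index(VARIANT_GROUPS)


def expand_name(name):
    """Return every spelling a search for `name` should also consider."""
    key = name.strip().lower()
    # Names with no known variants only match themselves.
    return VARIANT_INDEX.get(key, {key})


def matches(query, recorded):
    """True if the recorded name is the query or one of its variants."""
    return recorded.strip().lower() in expand_name(query)
```

With this table, `matches("Smith", "Smythe")` is true, while a name that is not in any group, such as "Jones", only matches itself.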

Click on the “Show advanced fields” button to make use of this search feature.

Coming next: allow a wildcard as the first character, increase range of years available (currently set at a maximum of +/- 2 years).

Enjoy!

Map - counties currently available

Friday, January 23rd, 2009

We thought this would help you visualise where we are in terms of county rollout.

map - 1911 census counties currently live

The counties live represent 83% of the total population in the 1911 Census.

ERRATUM: there is a blue, unnamed chunk of land lying between Denbighshire, Cheshire and Shropshire which is in fact part of Flintshire and therefore not live yet. Islands around the British Isles are also not shown. Thanks to the eagle-eyed readers who spotted this - we will get the map updated.

If you have not already seen it, we have posted an article on the order of scanning and rollout of the counties. Click below to read it:

http://blog.1911census.co.uk/2008/12/the-order-of-scanning-and-our-unintentional-northsouth-divide/

This map is based on an original county map generously provided by the Association of British Counties -  http://www.abcounties.co.uk

Transcription process and accuracy levels

Wednesday, January 21st, 2009

There have been a few questions on transcription accuracy and our policy towards certain aspects of transcribing the records. We hope this post clears up a few questions!

The transcription accuracy of the 1911census.co.uk website at launch is in excess of 98.5% according to recent tests - this threshold is set as a requirement by the National Archives.

Transcribing the census is a massive exercise - every single digitised document has to be read and transcribed and this process results in over 7 billion keystrokes over the course of the project. Naturally in this volume of keystrokes, more than a few errors will be made.

However, during the transcription process, we do apply a number of processes (developed during our many years’ experience of digitising censuses and other historical documents) to correct the most obvious errors and keep inaccuracy to a minimum.

The 1911 census in particular poses specific problems. Because the household summaries are the core documents rather than enumerators’ books, the variety of handwriting is significantly wider: there are 8 million different hands writing returns, making interpretation of the handwriting a much more challenging task!

Now some good news - the 98.5% accuracy at launch will improve over time.

The first way that it will be improved is by users of 1911census.co.uk reporting errors to us. Each report is reviewed by hand by the transcription team and if the change is approved, the change is incorporated into the search results, usually within a month (when the next data upload is made to the website).

Our policy is to accept changes only if they match what is on the original page (i.e. the household form). So if your ancestor made spelling mistakes on the original page, they will be carried through into the transcript. This is actually more common than you might think, so please be sure to check the original page before you assume that there is an error, rather than an accurate transcription of the original document.

The second way that we improve the quality of the transcription over time is by applying ‘data standardisation’ processes. These are basically sets of rules that we develop as we identify errors, and then apply to the data. A basic standardisation that we apply, for example, is converting “Geo” to “George” and listing records from Kent, Surrey and Middlesex as “London” if they fall within the metropolitan London area. We are developing and applying more data standardisations over time to eliminate more of the current transcription errors and to make searching easier, but some of these processes are much easier to apply once the data is complete.
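
The two standardisations mentioned above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the actual rule set: the record layout, the function `standardise` and the small list of metropolitan districts are hypothetical, and only the “Geo” to “George” rule and the three counties come from this post.

```python
# Abbreviation rule from the post: "Geo" is expanded to "George".
FORENAME_ABBREVIATIONS = {"geo": "George"}

# Hypothetical, deliberately tiny list of (county, district) pairs treated
# as metropolitan London. A real list would be much longer.
METROPOLITAN_DISTRICTS = {
    ("Surrey", "Lambeth"),
    ("Kent", "Greenwich"),
    ("Middlesex", "Hackney"),
}


def standardise(record):
    """Return a standardised copy of a transcribed record (a dict)."""
    out = dict(record)

    # Rule 1: expand known forename abbreviations ("Geo" or "Geo.").
    first = out["first_name"].strip()
    out["first_name"] = FORENAME_ABBREVIATIONS.get(
        first.rstrip(".").lower(), first
    )

    # Rule 2: list metropolitan districts under "London".
    if (out["county"], out["district"]) in METROPOLITAN_DISTRICTS:
        out["county"] = "London"

    return out
```

Because `standardise` returns a copy, the original transcription is left untouched, which fits the policy of keeping the transcript faithful to the original page.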

All of our transcriptions undergo thorough batch sampling, by the transcription house, by The National Archives and by our in-house Quality Control team. Any batch failing to meet the required level of accuracy is rejected and rekeyed.
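
As a rough sketch of how a batch-sampling check might work, the Python below draws a random sample from a batch and rejects the batch if the sample falls below a required accuracy. The 98.5% figure is the launch threshold quoted earlier in this post; the sample size, the `is_correct` check and the function names are assumptions made for the example, not a description of the actual quality-control procedure.

```python
import random

# Launch threshold quoted in this post; assumed here to apply per batch.
REQUIRED_ACCURACY = 0.985


def sample_accuracy(batch, is_correct, sample_size, seed=0):
    """Estimate accuracy by checking a random sample of the batch."""
    if not batch:
        raise ValueError("cannot sample an empty batch")
    rng = random.Random(seed)
    # Never ask for more entries than the batch contains.
    sample = rng.sample(batch, min(sample_size, len(batch)))
    correct = sum(1 for entry in sample if is_correct(entry))
    return correct / len(sample)


def batch_passes(batch, is_correct, sample_size=200, seed=0):
    """True if the sampled accuracy meets the required level."""
    return sample_accuracy(batch, is_correct, sample_size, seed) >= REQUIRED_ACCURACY
```

A batch for which `batch_passes` returns false would be the kind of batch that is rejected and rekeyed.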

One way of reducing transcription errors is by ‘double-keying’ every entry - this basically means getting the transcriptions done twice (by different people) and then comparing the two versions and eliminating differences by hand. However, doing this naturally doubles the transcription cost and would not improve the accuracy rate by a hugely significant degree (you can never reach 100%). The extra cost would also have had to be passed on to the public, resulting in higher prices for the census service.
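
The comparison step in double-keying can be sketched in a few lines of Python. This is an illustration only: the field names and the function `compare_keyings` are hypothetical, and the sketch simply separates the fields on which two transcribers agree from those that would need to be resolved by hand.

```python
def compare_keyings(first, second):
    """Compare two independent transcriptions of the same entry.

    Each transcription is a dict of field name -> keyed value.
    Returns (agreed, disputed): fields where both versions match, and
    fields where they differ, with both values kept for manual review.
    """
    agreed = {}
    disputed = {}
    for field in sorted(set(first) | set(second)):
        a, b = first.get(field), second.get(field)
        if a == b:
            agreed[field] = a
        else:
            disputed[field] = (a, b)
    return agreed, disputed
```

For example, if one transcriber keys the surname as "Smith" and the other as "Smyth", the surname ends up in `disputed` with both readings, while matching fields such as age go straight into `agreed`.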

We could also have taken the route of transcribing fewer fields – just a name index, like the old pre-digital booklets – but feel that this would have resulted in fewer people being able to find their ancestors, as it would narrow the number of fields you can search on. It would also have made the transcription much less useful for academic study, which is one of the uses to which the 1911 census will be put when it is completed.

It is important to remember that the transcription is designed as a finding aid for the original documents, which should be viewed as the “source of truth”; happily most users are able to find their ancestors despite the inevitable errors that creep in.

We have also provided very flexible search options (using wildcards, for example), which, with some lateral thinking, can also help you track down those who do not appear on the first search. The search options had to be constrained at launch to allow for the volumes of people searching, but we have been unlocking these features as the week has worn on, and there is more to come (see other blog posts).
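
To show how a wildcard can catch spellings you had not thought of, here is a small Python sketch using the standard library's `fnmatch` module. The wildcard syntax here (`*` for any run of characters, `?` for a single character) is that of `fnmatch` and is an assumption for the example; it may not match the exact syntax accepted by the site's search form, and `wildcard_search` is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def wildcard_search(pattern, names):
    """Return the names that match a wildcard pattern, ignoring case."""
    lowered = pattern.lower()
    return [name for name in names if fnmatchcase(name.lower(), lowered)]


# One pattern covers several spellings of the same surname.
surnames = ["Smith", "Smyth", "Smythe", "Jones"]
found = wildcard_search("Sm*th*", surnames)
```

Here `found` contains "Smith", "Smyth" and "Smythe" but not "Jones", so a single search covers three plausible transcriptions.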

More images available within the month at no extra cost

Thursday, January 15th, 2009

So far, we have only made a single image available on 1911census.co.uk - the principal original page of the RG14 Household (or Institution) Summary.

Within the next month, we will make the following images available at NO EXTRA COST.

If you have already bought the household original page, you will be able to view any associated images for free, simply by returning to the record you have paid for via the “My Records” area on the website. Any new images that you buy will have all the associated images available at the flat cost of 30 credits for the lot.

Not every search result will have all of the following images available but many households will.

Extra RG14 Household (or Institution) Schedule images:

  • The address panel from the back of the schedule, showing the address as written by your ancestors and the registration district and subdistrict
  • The front page of the volume in which your schedule was stored, giving more detailed information on parishes and districts

Extra RG78 Enumerator Summary Book images:

  • The front page of the volume in which the Enumerator’s Summary sheet was stored
  • Enumerator’s Summary original page - this not only shows the names of heads of households and how many people occupied the houses (showing you the neighbours), but also lists other buildings, whether houses or not.
  • Population statistics for the area
  • A description of the Enumerator’s walk
  • (In some cases) a map of the Enumerator’s walk

The Enumerator’s Summary original page in particular is a real treasure trove of local information and can also help you unlock mysteries such as family living nearby.

Looking at the one for my great-grandfather’s house in Hastings old town, I can see that, as well as private houses, it lists 2 pubs, many stables, a corporation store, the East Hill lift (!), rope huts, a mortuary, the Fisherman’s Church, what would nowadays be called a dump, and many more buildings. I can also see three families that remain family friends 98 years later living in the same row of houses!

We will update this blog once we have a firm date for the arrival of these images on the site.

UPDATE: June 18th - these images have now been added at the same time as the completion of the census, after some significant work in the past few months to get them ready for release. To view the extra images, you may need to load a fresh version of the page, especially if you are looking at an original household page that you have viewed previously. To do this, press the CTRL and F5 keys together to reload the page, and new buttons should appear allowing you access to the extra images. Please be aware that some household schedules do NOT have Enumerator’s Summary Books, as a small number did not survive.

% of the population covered by existing records

Thursday, January 15th, 2009

For those of you wanting to know the % of the population covered by the records available at launch, the figures are:

  • 88% of England
  • 83% of total population included in census (Including Wales, Islands, Military etc)

Some short videos about the 1911 census

Wednesday, January 14th, 2009

Here are a few videos showing you more about how we prepared the census to go online.

Enjoy!

Counties available on the site at launch

Monday, January 12th, 2009

Please find below a full list of the 36 counties available at launch. Please see our separate post on the rollout dates of the remaining counties.

If you have not already seen it, we have posted an article on the order of scanning and rollout of the counties. Click below to read it:

http://blog.1911census.co.uk/2008/12/the-order-of-scanning-and-our-unintentional-northsouth-divide/

Counties available at launch:

  1. Bedfordshire
  2. Berkshire
  3. Buckinghamshire
  4. Cambridgeshire
  5. Cheshire
  6. Cornwall
  7. Derbyshire
  8. Devonshire
  9. Dorsetshire
  10. Essex
  11. Gloucestershire
  12. Hampshire
  13. Herefordshire
  14. Hertfordshire
  15. Huntingdonshire
  16. Kent
  17. Lancashire
  18. Leicestershire
  19. Lincolnshire
  20. London
  21. Middlesex
  22. Norfolk
  23. Northamptonshire
  24. Nottinghamshire
  25. Oxfordshire
  26. Rutlandshire
  27. Shropshire
  28. Somersetshire
  29. Staffordshire
  30. Suffolk
  31. Surrey
  32. Sussex
  33. Warwickshire
  34. Wiltshire
  35. Worcestershire
  36. Yorkshire West Riding

Counties not available at launch:

England:

  • Durham
  • Cumberland
  • Northumberland
  • Westmorland
  • Yorkshire – East Riding and North Riding

Wales:

  • Anglesey
  • Brecknockshire
  • Carnarvonshire
  • Cardiganshire
  • Carmarthenshire
  • Denbighshire
  • Flintshire
  • Glamorgan
  • Merionethshire
  • Montgomeryshire
  • Monmouthshire
  • Pembrokeshire
  • Radnorshire

Other:

  • Isle of Man
  • Channel Islands
  • Royal Navy
  • Military Establishments

2 pieces (temporarily) missing from Yorkshire West Riding

Monday, January 12th, 2009

Although the vast majority of the West Riding of Yorkshire is complete, there are 2 pieces which are not yet online as they are still being examined by the Conservation Team to prepare them for scanning. The 2 pieces are from Knaresborough and Doncaster respectively, and each contains approximately 1,500 individuals.

We will add them in as soon as possible and update you when they are available.

Rollout of the remaining counties

Monday, January 12th, 2009

Although there are already 36 complete counties at launch (35 if you don’t count the West Riding of Yorkshire as a county in its own right), there are more to come before the census is complete.

The counties we have released were the most heavily populated in 1911, so 83% of the people included in the Census (and 88% of those in England) are available on the site at launch.

To understand the order in which we have been scanning the counties (and therefore the order in which they will become available), please read our previous post on scanning order: http://blog.1911census.co.uk/2008/12/the-order-of-scanning-and-our-unintentional-northsouth-divide/

Because of the way the pieces are ordered, those of you looking for relatives in the north of England and in Wales will have to be patient whilst we scan and transcribe the final pieces of the census. The good news is that we anticipate having the entire census complete by summer. The less good news is that we cannot give you a concrete date for when each individual county will become available (and neither can our Customer Support team!).

There’s a very good reason for this: scanning and preparing the census for publication online is a long process which is not 100% predictable. We have been making excellent progress to date (hence the early release of the counties so far, given that they were originally due to be completed in 2011), but there are always a few unpredictable things that can delay the process. These include having to stop scanning to hoover dust out of the scanning machines, having to send pieces for conservation before they can be scanned, or simple delays in moving the huge quantities of data generated from one place to another. The data for the 1911 census will take up over half a petabyte of storage - that’s the equivalent of over 50 million copies of the Yellow Pages and quite challenging to move around!
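
As a quick back-of-the-envelope check of that comparison, the short Python sketch below works out the size per Yellow Pages copy that the two figures imply. It assumes decimal units (1 petabyte = 10^15 bytes) and treats both "half a petabyte" and "50 million copies" as round numbers, so the result is only a rough guide; the function name is invented for the example.

```python
PETABYTE = 10**15  # bytes, decimal definition (an assumption)
MEGABYTE = 10**6   # bytes, decimal definition (an assumption)


def implied_megabytes_per_copy(total_petabytes, copies):
    """Size per copy, in megabytes, implied by a total and a copy count."""
    total_bytes = total_petabytes * PETABYTE
    return total_bytes / copies / MEGABYTE


per_copy = implied_megabytes_per_copy(0.5, 50_000_000)
```

On these round figures `per_copy` comes out at 10 megabytes for each copy of the Yellow Pages.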

Rest assured that as soon as we have more counties to release, we will let you know immediately by email (if you are signed up to receive updates) and on this blog.

BETA: Beta site closing - December 30th last day

Monday, December 29th, 2008

We are now reaching the end of the beta period for the 1911census.co.uk site. Tomorrow will be the last day that the site will be available for use. Unfortunately no more counties will be added at this point so we hope that you have enjoyed using what has been available over Christmas.

The full site launch in 2009 will include many more additional counties. Thank you for the huge volume of feedback you have given us and for your many comments - we will be analysing all of these in depth.

Our best wishes for a happy 2009 and continued success in researching your family history.

The 1911census.co.uk team