How do I know when a web page was last modified?


Infopackets reader ‘Dave’ writes:

“Dear Dennis,

I have a question to which I cannot find a clear answer. Can you tell me: is it possible to view a website / web page and determine when it was last updated?”

My answer:

The simple answer is, there’s no easy way to tell when a page was last updated – which is probably why you’re having a hard time figuring out the answer.

Information regarding a page’s ‘last modified date’ is usually reserved for website owners (via direct access to website files / databases with their associated timestamps). However, there are ways to get an approximation of when a page was last modified, even if you are not the website owner. I will discuss a few methods below.

View the last modified date using a web browser plug-in or add-on

There are browser plug-ins that report the last modified date of a web page, but this method does not always work. These plug-ins read the “Last-Modified” HTTP response header sent by the web server, but not all web servers send this header when you visit a page, and for dynamically generated pages the value may simply reflect the time the page was generated. At the time of writing this article, many of the browser plug-ins that I have researched (at least, for Firefox) that use this method appear to be flawed, probably for the reasons I just mentioned. If you use a plug-in that you have found to be reliable, you are welcome to share it.
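
If you are comfortable with a little scripting, you can read the same header yourself without a plug-in. The following is a minimal sketch in Python using only the standard library; the function names are my own for illustration, and the same caveat applies: if the server does not send “Last-Modified”, the result is simply empty.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def parse_last_modified(header_value):
    """Turn a Last-Modified header value into a datetime.

    Returns None when the header is missing or cannot be parsed.
    """
    if not header_value:
        return None
    try:
        return parsedate_to_datetime(header_value)
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return None


def fetch_last_modified(url, timeout=10):
    """Send a HEAD request and return the parsed Last-Modified date, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_last_modified(response.headers.get("Last-Modified"))


# Example use (needs network access):
#   print(fetch_last_modified("http://www.infopackets.com"))
```

A HEAD request asks only for the headers, so the page body is not downloaded. Some servers answer HEAD requests differently from normal page views, so treat the result as a hint rather than proof.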

Using RSS feeds to check the publication dates of articles

If a website offers an RSS feed (as many modern websites do), each article’s publication date is contained in the feed’s XML file. The address of the feed varies from site to site; on this site it is index.xml. You can open the file in your web browser and read the dates directly. Example:

http://www.infopackets.com/index.xml
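
Reading a long feed by eye gets tedious, so here is a rough Python sketch that lists the dates for you. It assumes an RSS 2.0 feed in which each article is an “item” with a “pubDate” element; Atom feeds use different element names and are not handled. The sample feed and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content, standing in for a downloaded index.xml.
SAMPLE_FEED = """
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First article</title>
      <pubDate>Mon, 06 Jan 2014 15:30:00 GMT</pubDate>
    </item>
    <item>
      <title>Second article</title>
      <pubDate>Tue, 07 Jan 2014 09:00:00 -0500</pubDate>
    </item>
    <item>
      <title>Article without a date</title>
    </item>
  </channel>
</rss>
"""


def rss_publication_dates(feed_xml):
    """Return (title, publication datetime) pairs for every dated item."""
    root = ET.fromstring(feed_xml.strip())
    results = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        pub_date = item.findtext("pubDate")
        if pub_date:  # items without a pubDate are skipped
            results.append((title, parsedate_to_datetime(pub_date.strip())))
    return results


for title, published in rss_publication_dates(SAMPLE_FEED):
    print(published.isoformat(), title)
```

Keep in mind that a publication date tells you when an article first appeared, which is not necessarily when it was last edited.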

Use Google Cache to check the last crawl of a page

Google Cache is a great way to view a web page as it appeared when it was last crawled by Google. You can access Google Cache at any time; however, there is no way to specify a cache date range, nor to know in advance when the page was last crawled. In other words, the date and time of the copy in Google’s cache is a crapshoot. To check Google’s cache, enter the following into your web browser:

http://webcache.googleusercontent.com/search?q=cache:http://www.infopackets.com

Note that you will need to replace “http://www.infopackets.com” at the end of the query with the website / web page you are looking for.
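
If you check many pages this way, the address can be assembled with a few lines of Python. This is only a sketch of the substitution described above; the function name is my own, and the script merely builds the address without checking whether the cache service responds for that page.

```python
from urllib.parse import quote

# Prefix taken from the example address above.
CACHE_PREFIX = "http://webcache.googleusercontent.com/search?q=cache:"


def google_cache_url(page_url):
    """Build the cache lookup address for the given page.

    ":" and "/" are left readable; characters such as "?" and "&" are
    escaped so they are not mistaken for part of the cache query itself.
    """
    return CACHE_PREFIX + quote(page_url, safe=":/")


print(google_cache_url("http://www.infopackets.com"))
```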

Using the “Wayback Machine” to visit websites in the past

You can also try the web archives available through the Internet Archive’s ‘Wayback Machine’. This is similar to Google’s cache, but provides an interface for choosing among crawl dates. Note that you cannot request an arbitrary date for a web page; you can only pick from the dates on which the Wayback Machine’s web crawler actually captured it.

http://archive.org/web/
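
The archive can also be queried from a script. The sketch below assumes the Internet Archive’s public “availability” endpoint and its JSON reply format (an “archived_snapshots” object holding a “closest” capture with a 14-digit timestamp), and it assumes those timestamps are in UTC. The function names are mine, and the sample reply in the usage note is illustrative only.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Wayback Machine availability service.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query_url(page_url, timestamp=None):
    """Build the query address; timestamp is an optional YYYYMMDD[hhmmss] hint."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_capture(payload):
    """Return (capture datetime, archive address) from a decoded reply, or None."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    captured = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    # Assumption: capture timestamps are expressed in UTC.
    return captured.replace(tzinfo=timezone.utc), closest["url"]


def fetch_closest_capture(page_url, timestamp=None, timeout=10):
    """Ask the service for the capture closest to the requested date."""
    query = availability_query_url(page_url, timestamp)
    with urlopen(query, timeout=timeout) as response:
        return closest_capture(json.load(response))
```

For example, a reply containing the timestamp “20130919044612” would be reported as a capture made on 19 September 2013 at 04:46:12. As with the web interface, you get the nearest date the crawler actually visited, not the date the page was edited.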

For website owners: specify the date of the last modification on the page

If you have a website, you can display the last modified date on the web page itself. There are different ways to do this, depending on the platform you are using. For example: if you are running the Apache web server and serving static files with Server Side Includes (SSI) enabled, you can display the last modified date of a static HTML page using the LAST_MODIFIED variable, for instance with the directive <!--#echo var="LAST_MODIFIED" -->.
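
If you are not using SSI, the same information is available from the file’s timestamp, which is what I meant earlier by direct access to the website files. Here is a small Python sketch that reads a file’s modification time and formats it the way web servers write dates. The function names and the file path in the example are hypothetical.

```python
import os
from datetime import datetime, timezone
from email.utils import format_datetime


def file_last_modified(path):
    """Return the file's modification time as a UTC datetime."""
    mtime = os.path.getmtime(path)  # seconds since the epoch
    return datetime.fromtimestamp(mtime, tz=timezone.utc)


def as_http_date(moment):
    """Format a UTC datetime like 'Wed, 21 Oct 2015 07:28:00 GMT'."""
    return format_datetime(moment, usegmt=True)


# Example use (hypothetical path to a page on your own server):
#   print("Last updated:", as_http_date(file_last_modified("/var/www/html/index.html")))
```

Note that the timestamp changes whenever the file is rewritten, for example after a restore from backup or a re-upload, even if the visible content stayed the same.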

Hope this helps answer your question.

About the Author: Dennis Faas is the owner and operator of Infopackets.com. With over 30 years of IT experience, Dennis’ areas of expertise are wide ranging and include computer hardware, Microsoft Windows, Linux, network administration and virtualization. Dennis has a BS in Computer Science (1999) and is the author of 6 books on the topics of MS Windows and PC Security. If you like the advice you received on this page, please vote for / Like this page and share it with your friends. For technical support requests, Dennis can be contacted via the live online chat on this site using the Zopim Chat service (currently located at the bottom left of the screen); optionally, you can contact Dennis through the contact form on the website.

