When and why should I use apt-get update

apt docker

General question:

Could someone explain what the command apt-get update does and when I really should use it?


Remarks

Please give a detailed answer, not just a copy of the man page, unless your version is really detailed (I have put the man page definition below).

apt-get update: Used to re-synchronize the package index files from their sources. The indexes of available packages are fetched from the location(s) specified in /etc/apt/sources.list(5). An update should always be performed before an upgrade or dist-upgrade.


Sub-questions:

  • Where is the package index stored? In a database? In a file?
  • What happens if I do apt-get install without updating the cache? Is there a chance that the remote package no longer exists and that the link is broken?
  • Is there an agreed policy for deb repositories? For example, should a repository contain only the latest version of a package, or on the contrary should it contain all versions available for a specific distribution release?

Context

I ask this question because I am studying the Docker framework. One of its features is the Dockerfile, which allows you to build a sort of OS image by executing the instructions in this file.
One property of this image is that it should always be the same, whatever the context is (time of build, etc).

I'm afraid that if I run the apt-get update command at different times, the result will be different, and so my images will be different.

Best Answer

apt-get update downloads the list of available packages.

The list of packages changes over time: new packages are added, and old packages are removed. If your cache is very old and you run apt-get install, apt may try to download a package file that no longer exists on the server, and the install fails.
How long an old package is kept in a repository is up to the repo maintainer (your distribution). So if you're using something like Docker, where the cache might be very out of date, you should always run apt-get update before installing any packages.
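
To make the failure mode concrete, here is a minimal sketch in Python. It is not how apt is implemented. It assumes the cached index is plain text made of stanzas separated by blank lines, each with Package, Version and Filename fields, as in Debian Packages files, and it ignores multi-line fields. The package names, versions, and paths in the sample data are made up for illustration.

```python
def parse_index(text):
    """Parse Packages-style stanzas into {package name: dict of fields}."""
    packages = {}
    for stanza in text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Split on the first colon only, so values may contain colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "Package" in fields:
            packages[fields["Package"]] = fields
    return packages


def can_download(index, name, files_on_mirror):
    """Return True if the file the cached index points to is still on the mirror."""
    return index[name]["Filename"] in files_on_mirror


# Hypothetical cached index, as it looked when `apt-get update` last ran.
STALE_INDEX = """\
Package: hello
Version: 1.0-1
Filename: pool/main/h/hello/hello_1.0-1_amd64.deb

Package: world
Version: 2.3-1
Filename: pool/main/w/world/world_2.3-1_amd64.deb
"""

# Hypothetical mirror contents today: hello 1.0-1 has been replaced by 1.0-2.
MIRROR_FILES = {
    "pool/main/h/hello/hello_1.0-2_amd64.deb",
    "pool/main/w/world/world_2.3-1_amd64.deb",
}

index = parse_index(STALE_INDEX)
for name in sorted(index):
    status = "ok" if can_download(index, name, MIRROR_FILES) else "gone (stale index)"
    print(f"{name} {index[name]['Version']}: {status}")
```

In this toy model, the stale index still points at the old hello file, which the mirror has since dropped. Refreshing the index with apt-get update is what would replace that entry with the current one.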

Packages are mostly added and removed for bug fixes and security updates, though with third-party repositories such as PPAs, anything goes.

When using something like Docker for containerization in a corporate environment, you should build the container once and then move that same container through your various release environments (development, staging, production), rather than rebuilding it each time. This ensures you never deploy a rebuilt container that differs from the one you tested.

To answer your question about where the cache files live: they are files in the /var/lib/apt/lists directory.
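
If you want to see what is cached on a given machine or image, a small sketch like the following can list the index files. It assumes the package index files have `_Packages` in their names (possibly followed by a compression suffix); the function name `cached_package_lists` is hypothetical. On a system without apt, it simply prints nothing.

```python
from pathlib import Path


def cached_package_lists(lists_dir="/var/lib/apt/lists"):
    """Return the sorted names of cached package index files in lists_dir."""
    directory = Path(lists_dir)
    if not directory.is_dir():
        return []
    return sorted(
        p.name
        for p in directory.iterdir()
        # Assumption: package index files carry "_Packages" in their name.
        if p.is_file() and "_Packages" in p.name
    )


for name in cached_package_lists():
    print(name)
```

An empty result on a Debian-based image usually means the lists were deleted to save space or apt-get update has not been run yet.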
