Add a data file updater #207

Closed
Ayuto opened this issue Jun 19, 2017 · 6 comments
Comments

@Ayuto
Member

Ayuto commented Jun 19, 2017

The updater should update all Source.Python data files. There are two options to update the files:

  • automatically, when Source.Python is loaded (add a setting to disable the auto-update)
  • manually, by using a server command
@jordanbriere
Contributor

I think updating on load is better than a command that does it on the fly. A command that updates the data files also means most of the modules would need to be reloaded at run-time (or rewritten to re-read the files, to ensure they don't lose data they might be carrying, such as containers of live objects). Doing it on load, before any setup_functions are called (except for the config), seems to be the easiest way, as all the modules could remain as is.

@Ayuto
Member Author

Ayuto commented Jun 20, 2017

Yeah, I don't want to implement the reloading part, so the command would simply require a restart of the server. This option could be useful for development servers, which are restarted several times.

@Ayuto
Member Author

Ayuto commented Jun 20, 2017

What do you think about this one?

import hashlib

from zipfile import ZipFile
from urllib.request import urlopen

from paths import DATA_PATH
from paths import SP_DATA_PATH

DATA_ZIP_FILE = DATA_PATH / 'source-python-data.zip'
CHECKSUM_URL = 'http://data.sourcepython.com/checksum.txt'
DATA_URL = 'http://data.sourcepython.com/source-python-data.zip'

def get_latest_data_checksum(timeout=3):
    """Return the MD5 checksum of the latest data from the build server.

    :param float timeout:
        Number of seconds that need to pass until a timeout occurs.
    :rtype: str
    """
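    with urlopen(CHECKSUM_URL, timeout=timeout) as url:
        # Strip whitespace so a trailing newline in checksum.txt cannot
        # make the comparison fail.
        return url.read().decode().strip()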

def download_latest_data(timeout=3):
    """Download the latest data from the build server.

    :param float timeout:
        Number of seconds that need to pass until a timeout occurs.
    """
    with urlopen(DATA_URL, timeout=timeout) as url:
        data = url.read()

    with DATA_ZIP_FILE.open('wb') as f:
        f.write(data)

def unpack_data():
    """Unpack ``source-python-data.zip``."""
    # Avoid shadowing the built-in zip() function.
    with ZipFile(DATA_ZIP_FILE) as zip_file:
        zip_file.extractall(DATA_PATH)

def update_data(timeout=3):
    """Download and unpack the latest data from the build server.

    :param float timeout:
        Number of seconds that need to pass until a timeout occurs.
    """
    download_latest_data(timeout)
    if SP_DATA_PATH.isdir():
        SP_DATA_PATH.rmtree()

    unpack_data()

def is_new_data_available(timeout=3):
    """Return ``True`` if new data is available.

    :param float timeout:
        Number of seconds that need to pass until a timeout occurs.
    :rtype: bool
    """
    if not DATA_ZIP_FILE.isfile():
        return True

    return DATA_ZIP_FILE.read_hexhash('md5') != get_latest_data_checksum(timeout)


# Test - Call this when the Python part of SP is loaded
# TODO:
# - Handle timeouts
# - Add setting to disable auto-update
# - Add command to manually update the data
if is_new_data_available():
    update_data()
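
The "Handle timeouts" and "Add setting to disable auto-update" items from the TODO list could look roughly like the sketch below. It is only an illustration: `try_update_data` and the `auto_update_enabled` flag are hypothetical names, and the two callables stand in for `is_new_data_available` and `update_data` from the snippet above so that the example runs on its own.

```python
import socket
from urllib.error import URLError


def try_update_data(is_new_data_available, update_data,
                    auto_update_enabled=True, timeout=3):
    """Run the update check without letting network errors stop loading.

    ``auto_update_enabled`` is a hypothetical setting. The two callables
    stand in for the functions of the same name in the snippet above.
    Returns ``True`` only if new data was downloaded and unpacked.
    """
    if not auto_update_enabled:
        return False

    try:
        if not is_new_data_available(timeout):
            return False
        update_data(timeout)
    except (URLError, socket.timeout) as error:
        # urlopen raises URLError for connection problems; a read that
        # times out can surface as socket.timeout instead.
        print('Data update skipped: {}'.format(error))
        return False

    return True
```

With this wrapper an unreachable build server would only skip the update and keep the existing data, instead of raising while Source.Python is loading.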

To get this working, I have created a new job on the build server, which gets triggered every time a commit to the master branch has been made. The job creates a zip file of ../addons/source-python/data/source-python and an MD5 checksum of the zip file (source-python.zip and checksum.txt).
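
For illustration, the build-side step could be sketched like this. This is not the actual build job: `build_data_archive` is a hypothetical helper, and keeping the top-level `source-python` folder inside the archive is an assumption based on the snippet above extracting into the data directory.

```python
import hashlib
from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile


def build_data_archive(data_dir, output_dir):
    """Zip ``data_dir`` and write the archive's MD5 next to it.

    Returns the hex digest that was written to ``checksum.txt``.
    """
    data_dir = Path(data_dir)
    output_dir = Path(output_dir)
    zip_path = output_dir / 'source-python.zip'

    with ZipFile(str(zip_path), 'w', ZIP_DEFLATED) as zip_file:
        for path in sorted(data_dir.rglob('*')):
            if path.is_file():
                # Assumption: keep the top-level folder name so that
                # extracting recreates ``source-python/``.
                arcname = path.relative_to(data_dir.parent).as_posix()
                zip_file.write(str(path), arcname)

    checksum = hashlib.md5(zip_path.read_bytes()).hexdigest()
    (output_dir / 'checksum.txt').write_text(checksum)
    return checksum
```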

If a server is started, this snippet checks if source-python.zip exists locally. If it doesn't, the file gets downloaded and extracted. If it exists, the MD5 checksum is created using the local source-python.zip and compared to the checksum stored on the build server. If they differ, the data gets updated using source-python.zip from the build server.
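
The comparison described here can also be written with the standard library only. The following is a sketch, not the code from the updater: `md5_hexdigest` and `needs_update` are hypothetical names, and `hashlib` is used in place of the path object's `read_hexhash('md5')`.

```python
import hashlib
from pathlib import Path


def md5_hexdigest(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at ``path``."""
    digest = hashlib.md5()
    with open(str(path), 'rb') as f:
        # Read in chunks so large archives are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def needs_update(local_zip, remote_checksum):
    """Return ``True`` if the local archive is missing or out of date."""
    local_zip = Path(local_zip)
    if not local_zip.is_file():
        return True
    # Strip whitespace in case checksum.txt ends with a newline.
    return md5_hexdigest(local_zip) != remote_checksum.strip().lower()
```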

Disadvantages:

  • Servers are updating their data even if we change data of a different game
  • Changes to the local data files are not recognized as source-python.zip is being used to determine changes
  • Since we don't have major releases, this could lead to exceptions e.g. if we remove a signature and the server is running an old SP version that requires this signature

Benefits:

  • Easy to implement
  • Easy to maintain (just update the data and commit it to GitHub)
  • Allows others to update the data via PRs (we just need to merge it and it's done)
  • Fast (servers just need to download 100KB)
  • Stable

@jordanbriere
Contributor

Like it, good job!

Changes to the local data files are not recognized as source-python.zip is being used to determine changes

Users shouldn't edit those files to begin with. Though we could always implement a system similar to the one we have for the translations: a _server suffixed file dedicated to hosting server-specific changes. But I really don't think this is necessary. If they really want to make changes and keep them as is, they could always simply disable auto-updating of the data.
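
To make the _server idea concrete, here is a minimal sketch. It assumes a data file has already been parsed into nested dictionaries; `server_file_for`, `merge_server_overrides` and the example keys are hypothetical and not part of Source.Python.

```python
from pathlib import Path


def server_file_for(path):
    """Return the ``_server`` sibling of a data file path."""
    path = Path(path)
    return path.with_name(path.stem + '_server' + path.suffix)


def merge_server_overrides(base, overrides):
    """Return a copy of ``base`` with ``overrides`` applied recursively."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections instead of replacing them entirely.
            merged[key] = merge_server_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Example with made-up keys: only the overridden value changes.
base = {'signature': {'identifier': 'old', 'convention': 'THISCALL'}}
overrides = {'signature': {'identifier': 'new'}}
merged = merge_server_overrides(base, overrides)
```

Since the updater would only ever replace the base files, the values in a _server file would survive every update.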

Since we don't have major releases, this could lead to exceptions e.g. if we remove a signature and the server is running an old SP version that requires this signature

We could always internally stop the updating if a newer version of SP is available to avoid such cases. This could also be controlled by a second setting that forces the update or not. Though I don't really think we should bother; the chance of any signatures getting removed is very low.

@Ayuto
Member Author

Ayuto commented Jun 21, 2017

Thanks!

Could always internally stop the updating if a newer version of SP is available to avoid such cases.

That wouldn't work, because a new Source.Python version is created as soon as we commit something. But we could create a backup of the current data before updating it. So, if that case should ever occur, you can disable the auto-update and restore the backup. Though, in that case _server files would be great, because then you can still have auto-update enabled and fix the exception by providing the signature on your own.
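
The backup idea could be sketched as follows, using only the standard library. `backup_data`, `restore_backup` and the backup location are hypothetical; the real updater may handle this differently.

```python
import shutil
from pathlib import Path


def backup_data(data_dir, backup_dir):
    """Copy ``data_dir`` to ``backup_dir``, replacing any older backup.

    Returns ``False`` if there is no data to back up.
    """
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not data_dir.is_dir():
        return False
    if backup_dir.is_dir():
        shutil.rmtree(str(backup_dir))
    shutil.copytree(str(data_dir), str(backup_dir))
    return True


def restore_backup(data_dir, backup_dir):
    """Replace ``data_dir`` with the contents of ``backup_dir``."""
    data_dir, backup_dir = Path(data_dir), Path(backup_dir)
    if not backup_dir.is_dir():
        return False
    if data_dir.is_dir():
        shutil.rmtree(str(data_dir))
    shutil.copytree(str(backup_dir), str(data_dir))
    return True
```

Calling `backup_data` right before the old data directory is removed would keep exactly one previous state around to restore from.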

@Ayuto
Member Author

Ayuto commented Jun 21, 2017

Both files have now been moved to a new server for better availability and shorter URLs (data.sourcepython.com). I have also started a new branch and committed the updater module:
https://github.com/Source-Python-Dev-Team/Source.Python/tree/data_updater

I also came to the conclusion that adding a command doesn't make much sense. If an important signature is outdated, SP won't fully load. So, I guess we will just stick with the auto-update option.

The problem we have right now is that data is already read/used when calling setup_core_settings, but logging and settings are required to log messages and to check whether auto-update is enabled/disabled. This needs to be fixed first.

@Ayuto Ayuto mentioned this issue Jul 4, 2017
@Ayuto Ayuto closed this as completed Jul 29, 2017