Data migration system from Version 3 to Version 4
This document describes the data migration process for the Pod application from version 3.8.x to version 4.0.x. The system is based on two main scripts:
- one to export data from Pod v3 to a JSON file,
- the other to import that JSON file into Pod v4.
Prerequisites
- A Pod 3.8.x installation (3.8.1 to 3.8.4 at the time of writing)
- A Pod 4.0.x installation (currently 4.0.0-beta)
- Make sure you have access to the Pod 3.8.x database (MariaDB/MySQL or PostgreSQL).
- Make sure you have access to the Pod 4.0.x database (MariaDB/MySQL or PostgreSQL).
Exporting data from Pod v3
Export Description
This first script exports data from the Pod v3.8.x database into a JSON file. It supports both MariaDB/MySQL and PostgreSQL and adapts SQL queries accordingly.
Note: This script must be run from a Pod v3 server.
The latest version of this script, export_data_from_v3_to_v4.py, is available here: https://github.com/EsupPortail/Esup-Pod/tree/main/pod/video/management/commands
You need to retrieve this script and place it in the pod/video/management/commands directory with the correct permissions.
Export Key features
- Exports specified tables from the Pod v3 database to a JSON file.
- Supports both MariaDB/MySQL and PostgreSQL.
- Creates a directory to store exported data if it does not already exist.
- Provides detailed success and error messages.
Important notes before Export
- The JSON file will be generated at this location: BASE_DIR/data_from_v3_to_v4/v3_exported_tables.json.
  - Example: /usr/local/django_projects/data_from_v3_to_v4/v3_exported_tables.json.
  - Check your custom/settings_local.py to find the configured BASE_DIR directory.
- This script can be run as many times as needed; the JSON file is regenerated with each execution.
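To make the location rule above concrete, here is a minimal sketch that builds the expected file path from a BASE_DIR value. The function name exported_json_path is hypothetical and is not part of the Pod scripts; only the directory and file names come from this document.

```python
from pathlib import Path


def exported_json_path(base_dir):
    """Return the export file location described above for a given BASE_DIR.

    Hypothetical helper: only the directory and file names are taken from
    the documentation; the function itself is not part of Pod.
    """
    return Path(base_dir) / "data_from_v3_to_v4" / "v3_exported_tables.json"


if __name__ == "__main__":
    # Example BASE_DIR from the note above; replace it with your own value.
    print(exported_json_path("/usr/local/django_projects").as_posix())
```

Running it with your own BASE_DIR is a quick way to confirm where to look for the file after the export.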
Data Consolidation
Before attempting an export, it may be useful to ensure data consolidation.
A dedicated script, check_database_problems.py, is available here: https://github.com/EsupPortail/Esup-Pod/tree/main/pod/video/management/commands
You need to retrieve this script and place it in the pod/video/management/commands directory with the correct permissions.
python manage.py check_database_problems
The script will detect and fix inconsistencies.
Export
Run the script from a Pod v3 server using the following command:
python manage.py export_data_from_v3_to_v4
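After the export, it can be useful to check that the file was written and is valid JSON before moving it to the v4 server. The sketch below is only an illustration: the internal layout of the file is not described in this document, so the code assumes a top-level JSON object whose values are lists of rows, and the function name summarize_export is hypothetical.

```python
import json
from pathlib import Path


def summarize_export(json_path):
    """Return {top-level key: number of entries} for an exported JSON file.

    Assumption (not confirmed by the documentation): the top level is a JSON
    object whose values are lists of rows. Values that are not lists are
    reported with a count of None.
    """
    with Path(json_path).open(encoding="utf-8") as handle:
        data = json.load(handle)  # raises json.JSONDecodeError if the file is invalid
    if not isinstance(data, dict):
        raise ValueError("Unexpected layout: top level is not a JSON object")
    return {
        key: (len(value) if isinstance(value, list) else None)
        for key, value in data.items()
    }
```

If the real file uses a different layout, the function raises ValueError rather than reporting misleading counts.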
Importing data into Pod v4
Import Description
This script imports data from the previously generated JSON file into a Pod v4 database. It supports MariaDB/MySQL and PostgreSQL, reads data from the specified JSON file, processes it, and inserts it into the appropriate tables in the Pod v4 database.
Import Key features
- Imports the JSON file of specified tables previously exported from the Pod v3 database.
- Supports MariaDB/MySQL and PostgreSQL.
- Creates a directory to store exported data if it does not already exist (usually unnecessary).
- Provides detailed success and error messages.
- Supports tags management for videos and recorders via the Tagulous library.
- Can run a Bash command to create the database and initialize data.
- Supports secure error handling and a dry-run mode.
Important notes before import
- The JSON file must be located at BASE_DIR/data_from_v3_to_v4/v3_exported_tables.json.
  - Example: /usr/local/django_projects/data_from_v3_to_v4/v3_exported_tables.json.
  - Check your custom/settings_local.py to find the configured BASE_DIR directory.
- Set DEBUG = False in settings_local.py to avoid debug/warning/info messages.
- Can be used with MariaDB/MySQL or PostgreSQL databases.
- If you encounter a “Too many connections” error, you can increase the time_sleep variable. The process will take longer but will complete without error.
- This script can be run multiple times; data is deleted before re-insertion.
- Depending on your data, this script may take a long time to complete. Typically, importing the video_viewcount table is slow. Additionally, since the tags management library has changed between v3 and v4, specific processing is required; it is deliberately slow in order to avoid “Too many connections” errors.
- After import, do not forget to make the Pod v3 MEDIA_ROOT accessible to Pod v4 servers.
- After import, do not forget to re-index all videos for Elasticsearch with:
  python manage.py index_videos --all
  ⚠️ The duration of this command depends on the number of videos to index (expect around 170 videos per minute).
Import
Run the script using the management command:
python manage.py import_data_from_v3_to_v4
Arguments
- --dry: Simulates what will be done (default=False).
- --createDB: Runs Bash commands to create tables in the database and add initial data (see make createDB). The database must be empty (default=False).
- --onlytags: Processes only the tags (default=False). Useful if you encounter the “Too many connections” error when processing tags.
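As an illustration of how three optional switches with a default of False behave, here is a standalone argparse sketch. It is not the import script's actual code: it assumes the options are plain boolean flags (which matches the examples in this document, where they are passed without a value), and the function name build_parser and the help texts are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser mirroring the three documented options.

    Assumption: each option is a boolean switch that defaults to False and
    becomes True when present on the command line.
    """
    parser = argparse.ArgumentParser(prog="import_data_from_v3_to_v4")
    parser.add_argument("--dry", action="store_true", default=False,
                        help="Simulate what will be done.")
    parser.add_argument("--createDB", action="store_true", default=False,
                        help="Create tables and initial data first (empty database only).")
    parser.add_argument("--onlytags", action="store_true", default=False,
                        help="Process only the tags.")
    return parser


if __name__ == "__main__":
    # Flags can be combined; any flag that is omitted keeps its default of False.
    options = build_parser().parse_args(["--dry", "--onlytags"])
    print(options.dry, options.createDB, options.onlytags)
```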
Examples
Dry run:
python manage.py import_data_from_v3_to_v4 --dry
If the database is completely empty (no tables), you can run this command, which performs a make createDB before importing data:
python manage.py import_data_from_v3_to_v4 --createDB
If you encountered a “Too many connections” error while importing tags, increase the time_sleep value (e.g., 0.4 or 0.5 seconds) and re-run the process for tags only:
python manage.py import_data_from_v3_to_v4 --onlytags
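The trade-off behind time_sleep can be pictured with a small throttling sketch: pausing after each operation gives the database time to release connections, at the cost of a longer run. This is not the import script's implementation. The function name process_with_pause, the handler callback, and the default value are hypothetical; only the time_sleep name and the 0.4 to 0.5 second range come from this document.

```python
import time


def process_with_pause(items, handler, time_sleep=0.4):
    """Call handler(item) for each item, pausing time_sleep seconds after each call.

    Hypothetical illustration: the pauses alone add at least
    len(items) * time_sleep seconds to the total run time.
    """
    processed = 0
    for item in items:
        handler(item)
        processed += 1
        time.sleep(time_sleep)  # give the database time to release connections
    return processed


if __name__ == "__main__":
    tags = ["lecture", "biology", "2024"]  # made-up sample data
    count = process_with_pause(tags, handler=print, time_sleep=0.5)
    print(f"{count} items processed")
```

Under this model, raising the pause from 0.4 to 0.5 seconds for 10,000 items adds about 1,000 seconds of waiting, which is why a larger value makes the import slower but gentler on the database.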
These arguments can be combined.
⚠️ The duration of this script depends on the number of videos and associated tags (expect around 500 videos per minute).
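The throughput figures in this document (around 500 videos per minute for the import, around 170 videos per minute for Elasticsearch re-indexing) can be turned into a rough time budget. The sketch below is simple arithmetic under the assumption that those rates hold for your installation; the function name estimate_minutes and the video count are hypothetical.

```python
import math


def estimate_minutes(video_count, videos_per_minute):
    """Return a rough duration in whole minutes, rounded up."""
    if videos_per_minute <= 0:
        raise ValueError("videos_per_minute must be positive")
    return math.ceil(video_count / videos_per_minute)


if __name__ == "__main__":
    videos = 10_000  # hypothetical catalogue size
    print(estimate_minutes(videos, 500))  # import: 20 minutes
    print(estimate_minutes(videos, 170))  # re-indexing: 59 minutes
```

Actual durations will vary with hardware, database load, and the number of tags per video, so treat the result as an order of magnitude only.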
By following these instructions, you should be able to successfully migrate your Pod 3.8.x database to Pod 4.0.x.