The intent of these processes is to clear previous expedition content from the shipboard database and ready it for the next expedition. Not all of the activities discussed here are required for every expedition. These notes also provide history and background for the exceptional cases and variations that do come up. Sometimes the process is more creative than routine; we want you to have a foundation for those times.
Keep in mind that some of these processes take a long time, and you may not want to tie up your laptop while waiting. Historically, the processes that deleted result records and defragmented storage ran 9-14 hours in the worst cases. On current ODAs and versions of Oracle, the longest processes take no more than 90 minutes to complete.
All the instructions and explicit details in these procedural notes assume that a copy of SQL Developer is configured for appropriate access via your DBA credentials, whether on your laptop (not recommended) or installed on the DEV or BUILD hosts. It is your responsibility to know how to re-establish that connectivity if it is broken. We frequently update tools like SQL Developer to keep pace with Oracle technology changes and security fixes.
Formerly these instructions were written to generate log files that you and your colleagues could inspect. That has not been done in this process since Feb 2023 (Exp 398).
If you are monitoring progress from another session (for example, by counting rows), be aware that it is not always effective due to database security and Oracle's model of atomicity, consistency, isolation, and durability. The degree of parallelism also changes how Oracle manages this information.
For command-line tools like sql, sqlplus, rman: if your password has spaces or symbol characters, you must quote the password, e.g.
> sql x@yz Password: "@ s3cr3t fr45e"
The database consists of distinct schemas; each schema serves a different function.
End-of-expedition processing leaves data in the LIMS and TRANSFER schemas. That content may be removed; beginning-of-expedition processing is what removes it. The procedures that conduct the removal are owned by the TRANSFER schema.
Oracle treats the changes resulting from end-of-expedition processing like any other transaction. Due to the size of these transactions, it is likely that the 100 GiB reserved for archive logging will be consumed. When this occurs, the database blocks all activity until archive log space is freed. All attempts at new transactions will receive the message ORA-00257: archiver stuck.
This method of releasing the archiver assumes there is actually some spare space on the system. Via a DBA account, increase the amount of archive log space available:
alter system set db_recovery_file_dest_size=1000G scope=memory;
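To confirm the new limit and see how much of the recovery area is actually in use, a quick check like this works from any DBA session (standard Oracle views, not a site-specific script):
    show parameter db_recovery_file_dest_size
    select file_type, percent_space_used, percent_space_reclaimable
      from v$recovery_area_usage;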
The ODA systems carry plenty of disk. The above will get Oracle back up and running with enough breathing room for you to connect and do the following.
The HTTP-based Oracle Enterprise Manager (OEM) provides a page for detailed monitoring and management of archive logs. Prerequisites for usage:
(1) Connect to OEM at https://oemjr.ship.iodp.tamu.edu:7802/em and log in with your OEM credential.
(2) Select the menu Target > Databases. Click the link corresponding to the database you wish to manage, then log in with your DBA-privileged RDBMS credential.
(3) To simply monitor archive log usage: on the secondary menu select Administration > Storage > Archive Logs, and refresh the page as needed to update the statistics. To manage the archive logs: on the secondary menu select Availability > Backup & Recovery > Manage Current Backups.
An operating system credential is required to send RMAN (recovery manager) commands.
TODO Provide more detail here as we get more practice.
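Until that detail is filled in, here is a minimal sketch of the kind of RMAN session typically used to reclaim archive log space. It assumes the logs have already been backed up per local policy; verify against the backup schedule before deleting anything.
    $ rman target /
    RMAN> crosscheck archivelog all;
    RMAN> delete noprompt archivelog all completed before 'sysdate-1';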
Run this SQL as the DBA to verify the archive logging status.
|
Expect output like this
|
or this
|
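If an equivalent check is needed, these standard Oracle commands and views cover the same ground (a sketch, not necessarily the site script):
    archive log list
    select log_mode, name from v$database;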
All scripts are run from procedures owned by the TRANSFER account. The typical data cleanup and preening session is shown above and repeated here
|
Most scripts take a single string parameter: the expedition. The ASMAN scripts take a single string parameter indicating the catalog to be deleted, usually formed as 'EXP' + the expedition number, e.g. 'EXP339'. The DESCINFO and DROPS catalogs are kept per user request.
Run the scripts from sql, SQL*Plus, or from within a SQL Developer worksheet. sql and SQL*Plus have the advantage of being lightweight and provide facilities to spool everything that transpires for post-cleanup review and comparison. SQL Developer has the advantage of making it easy to review and verify the scripts being invoked.
The scripts are designed to provide feedback; however, the Oracle mechanisms are not real-time. To turn on useful feedback mechanisms, apply these commands from the SQL*Plus command line (mileage will vary if tried in the context of SQL Developer):
|
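If the command list is not at hand, these are the usual SQL*Plus session settings for this kind of work (a sketch; the scripts may expect additional settings):
    set serveroutput on size unlimited
    set timing on
    set feedback on
    set echo on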
To capture the content of a SQL*Plus session, do something like this:
|
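For example (the spool file name is illustrative):
    spool exp_cleanup.log
    -- run the cleanup procedures here
    spool off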
The duration of the session depends heavily on the quantity of data collected. Be patient: it takes time to delete 14-20 million rows. Currently that takes about 30 minutes; not too long ago it took between 9 and 20 hours.
Special behavior for the scripts is noted below, e.g. order of execution is important in some cases, some processes are fast, some are slow, and some should be double-checked.
Data cleanup is routine, but has variations. The additional scenarios provided below attempt to clarify the variations we encounter. Apply the scenario or combination of scenarios that best fits the current expedition's requirements.
You will see a number of procedures in TRANSFER that are not documented here. These processes have been in disuse for so long that additional thought, testing, and documentation should be applied to them when they come up again. Specific questions? Ask around. Read the code.
The routine processes are described first. Then less common scenarios are described.
Irrevocably and quickly drop all content in the TRANSFER schema tables with "NEW" in their names. No parameter required.
|
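If you want to see which tables are in scope before running it, a quick survey via the data dictionary works (standard Oracle dictionary views; this does not show row counts):
    select table_name
      from all_tables
     where owner = 'TRANSFER'
       and table_name like '%NEW%'
     order by table_name;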
This script removes the database records only. It is a separate step to clean out the file system for the catalog. You are responsible for the database cleanup. The MCS take care of the file system. Talk with each other.
|
Repeat box_asman_catalog_delete for as many catalogs as you desire to remove. Confirm removal by re-running the survey script. Commit when done. The DESCINFO and DROPS catalogs are to be preserved--specific request by OPS and DESC technicians.
|
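As a sketch of what the session might look like, assuming the single-catalog-parameter call described above (the schema prefix and catalog name are illustrative; verify the procedure signature in TRANSFER):
    exec transfer.box_asman_catalog_delete('EXP339')
    -- repeat for each catalog to be removed, then:
    commit;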
This is a longer process, but no more than 30 minutes with current database size and hardware; 6-20 hour runs are thankfully a historical artifact. Progress of the script can still be checked by counting samples, tests, and particularly results.
|
Repeat box_lims_expedition_delete for as many expeditions as required. The smaller the quantity of data, the faster it will go.
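A sketch of the call, assuming the single expedition-string parameter described above (schema prefix and expedition value are illustrative; check the procedure signature in TRANSFER):
    exec transfer.box_lims_expedition_delete('398')
    commit;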
See the detailed process for setting expedition-specific variables.
We recommend the various select count(*) variants as routine content checks before and after the removal step.
|
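If the check script is not at hand, counts along these lines are the idea. The owner, table, and column names below are assumptions, not the verified LIMS schema; adjust them to match the actual tables before use:
    -- hypothetical names: adjust owner, tables, and expedition column
    select count(*) from lims.sample where x_expedition = '398';
    select count(*) from lims.test   where x_expedition = '398';
    select count(*) from lims.result where x_expedition = '398';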
This procedure is slow: data we want to keep is intermixed with data to be deleted, so we delete it rows at a time. The database is a 24/7 tool; there is always some activity against it.
These statistics are obsolete and in need of an update; the delete of Exp 398 result table content took about 30 minutes.
This procedure should NOT be run during routine laboratory data collection: for selected tables it turns off triggers and reduces archive logging. The procedure MAY be run during periods when only data reporting is being done.
This is the most typical expedition cleanup scenario:
Previous expedition content is being preserved on ship due to continuation with a new science party. Previously curated and collected samples are for reference only. The "legacy" content should be locked down so that it is not unintentionally modified. There is no special handling for ASMAN content. You have to remember that this expedition is now legacy and remove it at the appropriate time.
The curator manages this using the request code manager application. No developer involvement is required.
Transfer of content has been handled on an ad hoc basis. In general, the need should be flagged before the expedition and managed as needed. Common scenarios are noted here.
See the MAD container load process and additional related documentation on the shore wiki. The names and masses of containers are delivered in a spreadsheet (or text file).
Some expeditions are continuations or extensions of previous work. For convenience, they may wish to have a local copy of previous samples and analytical content. In these cases, pre-expedition preparation is preferred. Various methods are available.
Similar scenarios apply for carrying test data.
Some expeditions are continuations or extensions of previous work. For convenience, the science party may require a local copy of previous samples and analytical content. If small amounts of legacy material are brought out for re-sampling or re-analysis, it is easiest to just re-catalog the material in LIMS with Catwalk Sample and SampleMaster.
For re-measurement of ODP and DSDP material with current equipment, plan on bringing out the sample catalog (site, hole, core, section, section half, whole rounds) that was migrated into LIMS from Janus some time ago.
For new participants and staff to experience and practice using integrated shipboard systems, it is helpful to have some data to play with. The same is required for development that continues between expeditions.
Records are accumulated against an expedition/project called 999. After a while it is helpful to clean out tests and results. Be judicious and selective. Ask around: is anyone using this data? For example, sometimes end-user testing or development is in progress of which you may not have been aware. Once you have the go-ahead, it is often sufficient to clean out tests and results. Unless specifically testing sample cataloging, we prefer to avoid the effort of recreating a complete and representative set of samples. These clean-out scripts should only be used in the scenario where samples are being preserved [curatorial tests and results], but science data [non-curatorial] tests and results are being removed.
These are encoded in the TRANSFER script UTIL_LIMS_TEST_RESULT_DELETE.
|
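A sketch of the call, assuming it follows the same single-parameter convention as the other expedition scripts (verify the actual signature of UTIL_LIMS_TEST_RESULT_DELETE in TRANSFER before running):
    exec transfer.util_lims_test_result_delete('999')
    commit;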