TWINS PROJECT FINAL REPORT

 

 

 

 

By Aaron Lowe

 

 

 

 

 

 

 

 

 

5/3/2003

 

WSU Tri-Cities

CptS 451

 


TABLE OF CONTENTS

 

I.          PROJECT GOAL. 1

II.         TWINS BACKGROUND.. 1

Best Basis Inventory Subject Area General Description. 1

Measurements Subject Area General Description. 1

Safety Subject Area General Description. 2

Sample Analysis Subject Area General Description. 2

Vapor Subject Area General Description. 3

III.       INITIAL DESIGN.. 4

IV.       IMPLEMENTED DESIGN.. 5

V.        IMPORT PROCESS. 6

Creation of Temporary Tables. 6

Importation of Data into Temporary Tables. 7

Creation of Production Tables. 8

Filling the Production Tables. 8

VI.       REPORT RECREATION PROCESS. 10

VII.      ISSUES ENCOUNTERED.. 10

VIII.     PROPOSED FUTURE CHANGES. 11

Appendix A (Temp Table Creation Script) 12

Appendix B (Production Table Creation Script) 13

Appendix C (Data Transformation Script) 16

Appendix D (View Creation Script) 18


I.                      PROJECT GOAL

 

Five reports have been created using the Tank Waste Information System (TWINS).  The goal of this project is to design a normalized relational database to store the data from the five reports and provide a way to then recreate the reports from the database.  The first task is importing the data into the database.  This will involve the creation of tables and the transformation of the data from the report format to the database schema.  Next the reports will be recreated from the database using views.  The project will be deemed successful when the reports are accurately recreated from a sufficiently normalized database.

 

II.                      TWINS BACKGROUND

(extracted from TWINS data user guide)

Best Basis Inventory Subject Area General Description

 

The Best Basis Inventory Subject Area contains best-basis inventory estimates for chemical and radionuclide components in the 177 Hanford Site underground storage tanks. Inventories are presented on both a Tank-by-tank and global (total) basis. The tank-by-tank best-basis waste inventories include 25 chemical and 46 radionuclide components for each of the 177 double- and single-shell underground waste storage tanks. The global waste inventories include five chemicals in addition to the chemicals and radionuclides reported in the tank-by-tank inventories. The best-basis global inventories are independent estimates of the total amount of a chemical or radionuclide component in all tanks. The chemical analyte list selected represents 99 wt% of the tank contents, and the radionuclides represent over 99% of the activity. Information used to establish global inventories originated from key historical records (e.g., essential material purchase records), from various chemical flow-sheets used in reprocessing of irradiated Hanford Site reactor fuels, and from calculations of radionuclide isotope generation and decay. Tank-by-tank inventories are most often based on sample analysis results. Best-basis inventories now serve as waste composition data for Tank Waste Remediation System process flow-sheet modeling work, safety analyses, risk assessment, and waste retrieval, treatment, and disposal system design.

 

Measurements Subject Area General Description

 

The Measurements Subject Area contains data from the Surveillance Analysis Computer System (SACS) located in the 200 Area at Hanford.

Measurement types are:

 

#Surface Level

#Temperature (multiple sensor, thermocouple tree)

#Temperature (single sensor)

#Interstitial Liquid Level

 

These data are collected by a variety of means, including manual collection by operators in the field and automatic collection by the TMACS computer system which is a primary tank farm operations monitoring system. Data from TMACS is periodically transferred to SACS. The SACS data available through the Measurements Subject Area of TWINS2 is generally daily averages. A data "Quality Indicator" is always set by SACS to document the data quality.

 

Safety Subject Area General Description

 

The 177 underground waste storage tanks at Hanford, with a total capacity of 55 million gallons, create safety concerns that the Project Hanford Management Contractor (PHMC) is resolving. In FY 1997, projects are examining organics and flammable gas safety issues.

 

The Organics Tank Safety Project is investigating the hazards associated with the storage of organic complexant salts in the underground single-shell storage tanks. Evaluation of the double-shell tanks us not included in this investigation because these tanks contain appreciable amounts of aqueous waste and cannot support organic-complexant combustion under current storage conditions.

 

The Flammable Gas Safety Program investigates the generation, retention and release of flammable gas (ie. hydrogen, ammonia) in Hanford single-shell and double-shell underground waste tanks.

 

Sample Analysis Subject Area General Description

 

The Sample Analysis Subject Area contains data obtained from laboratory "data package" reports on a per-tank basis. Information includes Sampling Event data, Sample data, Analysis Method data, and Analytical Results data. Analytical Results include primary/duplicate/triplicate and other results, plus "QA results"; that is, results of tests that were performed for the purpose of assessing the performance of or calibrating the laboratory instrument. Sample Analysis data produced by the TWRS Program since about May 1989, is contained in the tables named, "Tank Results (hide QA data)", and , "Tank Results (show QA data)". In addition, previously collected laboratory data gleaned from letter reports and other sources is stored in the table named, "Historical Analytical Data". This constitutes all of the available Sample Analysis data obtained for the purpose of tank waste characterization for the Hanford high-level waste tanks.

 

Analytical Results are generally reported as concentrations - that is, the concentration of a constituent that was found by analyzing a sample using a specified testing procedure. The content of the Hanford tanks is often extremely inhomogeneous, and so the concentration of a given constituent as reported in the Analytical Results can vary greatly from sample to sample reflecting different areas or layers within the tank, or sample in-homogeneity.

In addition, a given constituent may be analyzed for by more than one method, yielding different results. Interpretation of these results requires chemical and statistical expertise.

 

Tank transfer data is also included in the Sample Analysis Subject Area even though it is not laboratory-generated analytical data. Tank Transfer data is data that describes transfer events where material is transferred to or from waste tanks, thereby changing their contents. This information, including a description of the waste type, is contained in the "Tank Transfers" table. Also, a "Transfer Since Sample" field is set in all tables containing analytical results whenever the sample date of the analytical result is earlier than the Tank Transfer date.

 

Vapor Subject Area General Description

 

Tank headspace characterization data and certain process test data obtained after 1992 are available in the Vapor Subject Area. Sample collection data (e.g., date, time, sampling method, tank riser, etc.) and sample analysis data (e.g., analytical laboratory, analysis date, chemical names, reported value, units, reference temperature, concentration in mg/m3 at 25C, etc.), averaged analytical reported values, and percents of the lower flammability limit (LFL) for selected analytes are included. Headspace flammability as total percent LFL is calculated by summing the percents of LFL for ammonia, carbon monoxide, hydrogen, methane, and total non-methane hydrocarbons.

 

Headspace characterization data are typically based on a single set of samples collected during a single sampling event, though some tanks have been sampled more than once. Most tank headspace samples have been collected in either SUMMA® canisters or sorbent traps, using the Vapor Sampling System (VSS), the In Situ Vapor Sampling (ISVS), or the In Situ Sampling (ISS) methods. Samples were collected from the central regions of the tank headspaces, and are thought to be representative of the headspace at the time of sampling. Samples were analyzed for inorganic analytes (NH3, NO2, NO, H2O), permanent gases (CO2, CO, CH4, H2, N2O), total non-methane hydrocarbons, and individual volatile and semi volatile organic compounds by analytical laboratories using prescribed procedures.

 

In addition to results from headspace characterization events, permanent gas concentrations from analyses of Standard Hydrogen Monitoring System

 (SHMS) Verification Grab Samples are provided. Also, limited data from industrial hygiene field monitoring of the tank headspaces, including combustible gas meter readings, are provided. Also, results from headspace and ventilation system samples collected during process testing are also included in the Test Data tables.


III.                      INITIAL DESIGN

 

 


IV.                      IMPLEMENTED DESIGN

 

 


V.                      IMPORT PROCESS

 

The import process consisted of the following four steps:

 

1.      Creating temporary tables to hold report data.

2.      Importing the report data into the temporary tables.

3.      Creating the production tables.

4.      Transforming the data and importing it into the production tables.

Creation of Temporary Tables

An SQL script was used to create the following tables (see Appendix A):


Importation of Data into Temporary Tables

 

The reports had to be in a tab delimited text file before importation.  All of the files, except for two, where originally in the correct format.  The BBI_Summary and BBI_Tank_Inventory reports had to be converted from excel format to the delimited text format.  This was done by opening the files with excel and saving them as text files.  The inventory columns displayed decimal numbers in scientific notation, so these had to be reformatted to text, before conversion, in order to retain the underlying precision.

 

The following calls to BCP were made from a command prompt:

 

Bcp aLOWE..CONST_GROUP in SAMPLE_CONST_GROUPS.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL

 

Bcp aLOWE..COMPOUND_GROUP in VAPOR_COMP_GROUPS.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL

 

Bcp aLOWE..WT_SUMMARY in MEAS_WASTE_TANK_SUMM.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL

 

Bcp aLOWE..BB_SUMMARY in BBI_SUMMARY.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL

 

Bcp aLOWE..BBT_INVEN in BBI_TANK_INVEN.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL

 

 

The “/F 2” parameter tells the program to start on the second line.  This was used because the first line in the reports contained the field names.  The “/c” parameter informs the program to use the Character format when importing the text files.  The “/U” and “/P” parameters where used to provide the username and password for the SQL Server instance, which was designated with the “/S” parameter.  After the import process, it was confirmed that the temporary tables contained the same number of  records as the original reports and that none of the none of the data had been distorted or truncated.

 


Creation of Production Tables

An SQL script was used to create the following tables (see Appendix B):

Filling the Production Tables

The final step in the import process was transferring the data from the temporary tables to the production tables.  This required a string of INSERT and UPDATE queries that transformed and inserted the data (see Appendix C).

 

The Constituent table was loaded using three INSERT queries and one UPDATE query.  The first INSERT query loaded data from the Compound_Group table into the Constituent table and set the BBI_Summ_Flag to “no.”  The next query inserted the records from the Const_Group table where the Con_Name did not already exist in the Constituent table and set BBI_Summ_Flag to “no.”  This query attempted to reduce the overlapping records that might exist in the Compound_Group and Const_Group reports.  The next query inserted Con_Names from the BBI_Inventory.  This was done to make sure all constituents were loaded into the Constituent Table, even if they only existed in the BBI_Inventory.  The final query updated the Constituent table by setting the BBI_Summ_Flag to “yes” where the Con_Name was in the BBI_Inventory table.

 

Next, the Con_Group table was filled using three INSERT queries.  The first inserted all of the groups found in the Const_Group and Compound_Group tables.  The last query added one group called “None” which would be used with the BB tables in Con_Synonym.

 

Next, the Synonym_Group table was filled using four INSERT queries that simply added “Best Basis Inventory Long Name,” “Best Basis Inventory Short Name,” “Constituent Name,” and “Chemical_Name.”

 

The CCB_Log table was filled using one query that took data from the BBI_Summary table.  Only the records with the lowest Publish and Issuance dates for each CCB_Log number were used.  This made it possible to have one record for each CCB_Log.  Some data was lost as a result of this operation since one CCB_Log may have had multiple values in the corresponding fields.

 

The Tank table was filled using an INSERT and an UPDATE query.  The INSERT query selected and inserted the unique Tank_IDs from the BBI_Inventory table.  It clipped off the leading “241-“ for the Tank_ID, but left it on for the Tank_Alt_Name.  It also set each of the Tank_Types to “DST.”  The UPDATE query set the Tank_Type to “SST” for the “A” and “AX” tanks.

 

Filling the Con_Group_Member table required two INSERT queries.  The first joined the Con_Group, Const_Group, and Constituent tables to obtain the Con_Ids associated with each group.  The second used a similar approach but used the Compound_Group table instead of the Const_Group table.

 

Next, the Con_Synonym table was filled using four INSERT queries.  The first joined on Compound_Group, Con_Group, and Constituent and inserted the returned values, setting the Con_Syn_ID to the ID of the “Chemical Name” Con_Synonym.  The next three queries used the same idea, but set the Con_Syn_ID to the appropriate value.

 

The Inventory table was filled with one query that obtained most of its data from the BBI_Inventory table.  The Con_ID field was filled for each record by selecting the Con_ID from the Constituent table where Con_name was equal to Analyte_Name and the BBI_Summ_Flag was equal to “Yes”.

 

The Tank_Measure table was filled with a simple INSERT query that pulled the data straight from the WT_Summary table.


VI.                      REPORT RECREATION PROCESS

 

The Report Recreation Process consisted of the creation of five views that correspond to the five original reports.  A script was written to create the views (see Appendix D).   Each view was defined by a SELECT statement that queried data from multiple tables.

 

The Sample_Const_Groups view joins the Con_Synonym, Con_Group, Synonym_Group and Constituent tables where Synonym_Group is “Constituent Name”.  The Vapor_Comp_Groups view is similar, but selects records where Synonym Group is “Chemical Name”.  The BBI_Tank_Inven view joins the Inventory, Constituent, Con_Synonym, Synonym_Group, CCB_Log, and Tanks tables where Synonym_Group is “Best Basis Inventory Short Name.”  The Meas_Waste_Tank_Summ view joins the Tank_Measure and Tanks tables.  Finally, the BBI_Summary view joins the Inventory, Con_Synonym, Synonym_Group and Constituent tables where Synonym_Group is “Best Basis Inventory Short Name.”  The cross tab effect of the query is created using the Sum aggregate function and Case statements.  The long list of Sum statements was created using a query on the Tanks table, which concatenated the desired strings to the returned values.

 

The project appeared successful at this point.  The number of records returned by each view matched the number in the original reports.  A complete validation was not performed, but several comparisons where made between the views and reports.  No incongruences were found between the two, except for slight differences in some of the numerical fields.  This issue is discussed in the Issues Encountered section.

 

VII.                      ISSUES ENCOUNTERED

 

While using BCP to import the data from the reports, three of the reports imported successfully, while the other two did not.  The program printed “Truncation Error” to the screen and did not load any of the data from these tables.  This meant that one or more fields in the destination tables were smaller than the data that was being loaded into them.  It would have been helpful if the BCP program had supplied more information about where the error was occurring.  In order to find the fields that were too short, DTS was used to import the data.  It was found that DTS loaded the data without errors, but truncated those fields that were too long.  The culprit fields were found by manually searching the tables to find values that appeared truncated.  It seems that both BCP and DTS have room for improvement in the error reporting area.

 

Another issue encountered involved redundancy in the Constituent and Con_Synonym tables.  The Constituent table should have had one record for each CAS number, if the CAS number was available.  The data was transferred to the production tables in a way that attempted to dissolve this issue by only loading each constituent once.  This worked well except for the case where the Const_Group report contained multiple constituents with the same CAS number.  There were only two instances of this, but it was enough to cause some redundancy within the Constituent and Con_Synonym tables.  The Constituent redundancy directly caused the Con_Synonym redundancy since multiple Con_Ids were given to single constituents and the two tables were joined on the Con_ID.   This problem was not resolved because it was considered minor, and had no effect on the output of the reports.

 

The last issue encountered involved the conversion of a string value to a float value.  SQL Server exhibited a problem with this conversion, in that it did not produce a float value that exactly matched the string representation.  The float values often differed from the original by 1 x 10⁻⁹ or less.

 

 

VIII.                      PROPOSED FUTURE CHANGES

 

While the project was overall successful, there where some areas that could have been executed differently. First of all, the implemented design for the Production tables involved two fields that were not needed.  These were the BBI_Summ_Flag field of the Constituent table and the Comment field of the Con_Group table.  The BBI_Summ_Flag was not needed because the same information could be found using the Con_Synonym and Synonym_Group tables.  By joining with these tables where the Synonym_Group is “Best Basis Inventory Short Name” we can derive those constituents that were used in the BBT_Inventory and BBI_Summary reports.  The Comment field is not needed within the Con_Group table, because the comments within the Compound_Group report were associated with the Constituent not the Group.  The comment was handled in the Constituent table Comment field.

 

Another proposed change involves the Inventory and CCB_Log tables.  While the client required that the CCB_Log table should only include one record for each CCB_Log number, it would be more accurate to add the Doc_Number to the primary key.  The client requested that the record with the earliest Publish and Issue dates for each unique CCB_Log number should be used for the CCB_Log table.  While this may be an attempt by the client to clean up data problems, it appears to destroy the true relationship between the CCB_Log number and the Document number.  Not enough was known about the true nature of this data, so the client should have been asked to clarify this situation.

 


Appendix A (Temp Table Creation Script)

--create_tables.sql

 

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Const_Group'

                  AND       type = 'U')

    DROP TABLE Const_Group

GO

CREATE TABLE Const_Group

   (

                Constituent_Group varchar (20),

                Constituent_Name varchar(65),

                CAS_Number varchar (20)

   )

GO

 

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Compound_group'

                  AND       type = 'U')

    DROP TABLE Compound_group

GO

CREATE TABLE Compound_group

   ( 

                Element_Functionality_Group varchar (30),

                Chemical_Name varchar (100),

                Chemical_ID varchar (18),

                Comments_on_Classification varchar (60)  

   )

GO

 

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'BB_Summary'

                  AND       type = 'U')

    DROP TABLE BB_Summary

GO

CREATE TABLE BB_Summary

   ( 

                Analyte varchar (10), Standard_Units_of_Measure varchar (2),

                A101 varchar (8), A102 varchar (8), A103 varchar (8), A104 varchar (8), A105 varchar (8),

                A106 varchar (8), AN101 varchar (8), AN102 varchar (8), AN103 varchar (8),

                AN104 varchar (8), AN105 varchar (8), AN106 varchar (8), AN107 varchar (8),

                AP101 varchar (8), AP102 varchar (8), AP103 varchar (8), AP104 varchar (8),

                AP105 varchar (8), AP106 varchar (8), AP107 varchar (8), AP108 varchar (8),

                AW101 varchar (8), AW102 varchar (8), AW103 varchar (8), AW104 varchar (8),

                AW105 varchar (8), AW106 varchar (8), AX101 varchar (8), AX102 varchar (8),

                AX103 varchar (8), AX104 varchar (8), AY101 varchar (8), AY102 varchar (8),

                AZ101 varchar (8), AZ102 varchar (8)

   )

GO

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'BBT_Inven'

                  AND       type = 'U')

    DROP TABLE BBT_Inven

GO

CREATE TABLE BBT_Inven

   ( 

 

                Tank varchar (10), Analyte_Name varchar (25), Analyte varchar (15),

                Inventory varchar (10), Units varchar (2), Basis varchar (5), Comment varchar (225),

                Inventory_Type varchar (4), Effective_Date varchar (26), Publish_Date varchar (26),

                CCB_Log# varchar (12), Document_Number varchar (40),

                Document_Issuance_Date varchar (26),  SortOrder varchar (3)

    )

GO

 

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'WT_Summary'

                  AND       type = 'U')

    DROP TABLE WT_Summary

GO

CREATE TABLE WT_Summary

   ( 

                TankName varchar (15), ReportDate varchar (26),

                Waste_Material varchar (35), Tank_Integrity varchar (15),

                Isolation_Status varchar (60), Total_Waste_Volume varchar (6),

                Supernatant_Volume varchar (6), Drainable_Interstitia_Volume varchar (6),

                Volume_Pumped_This_Month varchar (6), Total_Volume_Pumped varchar (6) ,

                Drainable_Liquid_Remaining varchar (6),

                Pumpable_Liquid_Remaining varchar (6), Sludge_Volume varchar (6),

                Salt_Cake_Volume varchar (6), Liquids_Volume_Method varchar (100),

                Solids_Volume_Method varchar (100), Solids_Volume_Updated varchar (26),

                Last_PhotoDate varchar (26), Last_VideoDate varchar (26),

                Space_Available varchar (6), DoubleShell_Slurry_DSS varchar (6),

                TankUse varchar (35), Equivalent_Waste_Inches varchar (50)

    )

 

Appendix B (Production Table Creation Script)

--create_prod_tables.sql

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Con_Synonym'

                  AND       type = 'U')

    DROP TABLE Con_Synonym

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Con_Group_Member'

                  AND       type = 'U')

    DROP TABLE Con_Group_Member

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Con_Group'

                  AND       type = 'U')

    DROP TABLE Con_Group

GO

 

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Synonym_Group'

                  AND       type = 'U')

    DROP TABLE Synonym_Group

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Inventory'

                  AND       type = 'U')

    DROP TABLE Inventory

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'CCB_Log'

                  AND       type = 'U')

    DROP TABLE CCB_Log

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Tank_Measure'

                  AND       type = 'U')

    DROP TABLE Tank_Measure

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Tanks'

                  AND       type = 'U')

    DROP TABLE Tanks

GO

IF EXISTS(SELECT name

                  FROM     sysobjects

                  WHERE  name = 'Constituent'

                  AND       type = 'U')

    DROP TABLE Constituent

GO

CREATE TABLE Constituent

(

                Con_ID int identity(1,1),

                Con_Name varchar(100),

                Con_Unit varchar(3),

                Con_CAS varchar(11),

                BBI_Summ_Flag varchar(1),

                Comment varchar(255),

 

                CONSTRAINT Con_ID_PK PRIMARY KEY (Con_ID)

 

)

GO

 

CREATE TABLE Con_Group

(

                Con_Group_ID int identity(1,1),

                Con_Group_Name varchar(30),

                Comment varchar(255),

 

                CONSTRAINT Con_Group_PK PRIMARY KEY (Con_Group_ID)

)

GO

 

CREATE TABLE Con_Group_Member

(

                Con_Group_ID int

                FOREIGN KEY REFERENCES Con_Group(Con_Group_ID),

                Con_ID int

                FOREIGN KEY REFERENCES Constituent(Con_ID)

                ON DELETE NO ACTION,

 

                CONSTRAINT Con_Group_ID_PK PRIMARY KEY (Con_Group_ID, Con_ID)

)

GO

 

CREATE TABLE Synonym_Group

(

                Con_Syn_ID int identity(1,1),

                Synonym_Group varchar(100)

 

                CONSTRAINT Syn_Grp_ID_PK PRIMARY KEY (Con_Syn_ID)

)

GO

 

CREATE TABLE Con_Synonym

(

                Con_Syn_ID int

                FOREIGN KEY REFERENCES Synonym_Group(Con_Syn_ID),

                Con_ID int

                FOREIGN KEY REFERENCES Constituent(Con_ID),

                Con_Group_ID int

                FOREIGN KEY REFERENCES Con_Group(Con_Group_ID),

                Synonym varchar(100),

 

                CONSTRAINT Con_Syn_ID_PK PRIMARY KEY (Con_Syn_ID, Con_ID, Con_Group_ID)

)

GO

 

CREATE TABLE Tanks

(

                Tank_ID varchar(6),

                Tank_Alt_Name varchar(10),

                Tank_Type varchar(3),

 

                CONSTRAINT Tank_ID_PK PRIMARY KEY (Tank_ID)

)

GO

 

CREATE TABLE CCB_Log

(

                CCB_Log_ID varchar(15),

                Doc_Number varchar(50),

                Doc_Issue_Date datetime,

                Effective_Date datetime,