By Aaron Lowe
5/3/2003
WSU Tri-Cities
CptS 451
TABLE OF CONTENTS
Best Basis Inventory Subject Area
General Description
Measurements Subject Area General
Description
Safety Subject Area General
Description
Sample Analysis Subject Area General
Description
Vapor Subject Area General
Description
Importation of Data into Temporary
Tables
Appendix A (Temp Table Creation Script)
Appendix B (Production Table Creation Script)
Five reports have been created using the Tank Waste Information System (TWINS). The goal of this project is to design a normalized relational database to store the data from the five reports and provide a way to then recreate the reports from the database. The first task is importing the data into the database. This will involve the creation of tables and the transformation of the data from the report format to the database schema. Next the reports will be recreated from the database using views. The project will be deemed successful when the reports are accurately recreated from a sufficiently normalized database.
The Best Basis Inventory Subject Area contains best-basis inventory estimates for chemical and radionuclide components in the 177 Hanford Site underground storage tanks. Inventories are presented on both a Tank-by-tank and global (total) basis. The tank-by-tank best-basis waste inventories include 25 chemical and 46 radionuclide components for each of the 177 double- and single-shell underground waste storage tanks. The global waste inventories include five chemicals in addition to the chemicals and radionuclides reported in the tank-by-tank inventories. The best-basis global inventories are independent estimates of the total amount of a chemical or radionuclide component in all tanks. The chemical analyte list selected represents 99 wt% of the tank contents, and the radionuclides represent over 99% of the activity. Information used to establish global inventories originated from key historical records (e.g., essential material purchase records), from various chemical flow-sheets used in reprocessing of irradiated Hanford Site reactor fuels, and from calculations of radionuclide isotope generation and decay. Tank-by-tank inventories are most often based on sample analysis results. Best-basis inventories now serve as waste composition data for Tank Waste Remediation System process flow-sheet modeling work, safety analyses, risk assessment, and waste retrieval, treatment, and disposal system design.
The Measurements Subject Area contains data from the Surveillance Analysis Computer System (SACS) located in the 200 Area at Hanford.
Measurement types are:
#Surface Level
#Temperature (multiple sensor, thermocouple tree)
#Temperature (single sensor)
#Interstitial Liquid Level
These data are collected by a variety of means, including manual collection by operators in the field and automatic collection by the TMACS computer system which is a primary tank farm operations monitoring system. Data from TMACS is periodically transferred to SACS. The SACS data available through the Measurements Subject Area of TWINS2 is generally daily averages. A data "Quality Indicator" is always set by SACS to document the data quality.
The 177 underground waste storage tanks at Hanford, with a total capacity of 55 million gallons, create safety concerns that the Project Hanford Management Contractor (PHMC) is resolving. In FY 1997, projects are examining organics and flammable gas safety issues.
The Organics Tank Safety Project is investigating the hazards associated with the storage of organic complexant salts in the underground single-shell storage tanks. Evaluation of the double-shell tanks us not included in this investigation because these tanks contain appreciable amounts of aqueous waste and cannot support organic-complexant combustion under current storage conditions.
The Flammable Gas Safety Program investigates the generation, retention and release of flammable gas (ie. hydrogen, ammonia) in Hanford single-shell and double-shell underground waste tanks.
The Sample Analysis Subject Area contains data obtained from laboratory "data package" reports on a per-tank basis. Information includes Sampling Event data, Sample data, Analysis Method data, and Analytical Results data. Analytical Results include primary/duplicate/triplicate and other results, plus "QA results"; that is, results of tests that were performed for the purpose of assessing the performance of or calibrating the laboratory instrument. Sample Analysis data produced by the TWRS Program since about May 1989, is contained in the tables named, "Tank Results (hide QA data)", and , "Tank Results (show QA data)". In addition, previously collected laboratory data gleaned from letter reports and other sources is stored in the table named, "Historical Analytical Data". This constitutes all of the available Sample Analysis data obtained for the purpose of tank waste characterization for the Hanford high-level waste tanks.
Analytical Results are generally reported as concentrations - that is, the concentration of a constituent that was found by analyzing a sample using a specified testing procedure. The content of the Hanford tanks is often extremely inhomogeneous, and so the concentration of a given constituent as reported in the Analytical Results can vary greatly from sample to sample reflecting different areas or layers within the tank, or sample in-homogeneity.
In addition, a given constituent may be analyzed for by more than one method, yielding different results. Interpretation of these results requires chemical and statistical expertise.
Tank transfer data is also included in the Sample Analysis Subject Area even though it is not laboratory-generated analytical data. Tank Transfer data is data that describes transfer events where material is transferred to or from waste tanks, thereby changing their contents. This information, including a description of the waste type, is contained in the "Tank Transfers" table. Also, a "Transfer Since Sample" field is set in all tables containing analytical results whenever the sample date of the analytical result is earlier than the Tank Transfer date.
Tank headspace characterization data and certain process test data obtained after 1992 are available in the Vapor Subject Area. Sample collection data (e.g., date, time, sampling method, tank riser, etc.) and sample analysis data (e.g., analytical laboratory, analysis date, chemical names, reported value, units, reference temperature, concentration in mg/m3 at 25C, etc.), averaged analytical reported values, and percents of the lower flammability limit (LFL) for selected analytes are included. Headspace flammability as total percent LFL is calculated by summing the percents of LFL for ammonia, carbon monoxide, hydrogen, methane, and total non-methane hydrocarbons.
Headspace characterization data are typically based on a single set of samples collected during a single sampling event, though some tanks have been sampled more than once. Most tank headspace samples have been collected in either SUMMA® canisters or sorbent traps, using the Vapor Sampling System (VSS), the In Situ Vapor Sampling (ISVS), or the In Situ Sampling (ISS) methods. Samples were collected from the central regions of the tank headspaces, and are thought to be representative of the headspace at the time of sampling. Samples were analyzed for inorganic analytes (NH3, NO2, NO, H2O), permanent gases (CO2, CO, CH4, H2, N2O), total non-methane hydrocarbons, and individual volatile and semi volatile organic compounds by analytical laboratories using prescribed procedures.
In addition to results from headspace characterization events, permanent gas concentrations from analyses of Standard Hydrogen Monitoring System
(SHMS) Verification Grab Samples are provided. Also, limited data from industrial hygiene field monitoring of the tank headspaces, including combustible gas meter readings, are provided. Also, results from headspace and ventilation system samples collected during process testing are also included in the Test Data tables.
The import process consisted of the following four steps:
1. Creating temporary tables to hold report data.
2. Importing the report data into the temporary tables.
3. Creating the production tables.
4. Transforming the data and importing it into the production tables.
An SQL script was used to create the following tables (see Appendix A):
The reports had to be in a tab delimited text file before importation. All of the files, except for two, where originally in the correct format. The BBI_Summary and BBI_Tank_Inventory reports had to be converted from excel format to the delimited text format. This was done by opening the files with excel and saving them as text files. The inventory columns displayed decimal numbers in scientific notation, so these had to be reformatted to text, before conversion, in order to retain the underlying precision.
The following calls to BCP were made from a command prompt:
Bcp aLOWE..CONST_GROUP in SAMPLE_CONST_GROUPS.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL
Bcp aLOWE..COMPOUND_GROUP in VAPOR_COMP_GROUPS.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL
Bcp aLOWE..WT_SUMMARY in MEAS_WASTE_TANK_SUMM.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL
Bcp aLOWE..BB_SUMMARY in BBI_SUMMARY.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL
Bcp aLOWE..BBT_INVEN in BBI_TANK_INVEN.txt /F 2 /c /U alowe_db /P dog49tank /S WSUTCSQL
The “/F 2” parameter tells the program to start on the second line. This was used because the first line in the reports contained the field names. The “/c” parameter informs the program to use the Character format when importing the text files. The “/U” and “/P” parameters where used to provide the username and password for the SQL Server instance, which was designated with the “/S” parameter. After the import process, it was confirmed that the temporary tables contained the same number of records as the original reports and that none of the none of the data had been distorted or truncated.
An SQL script was used to create the following tables (see Appendix B):

The final step in the import process was transferring the data from the temporary tables to the production tables. This required a string of INSERT and UPDATE queries that transformed and inserted the data (see Appendix C).
The Constituent table was loaded using three INSERT queries and one UPDATE query. The first INSERT query loaded data from the Compound_Group table into the Constituent table and set the BBI_Summ_Flag to “no.” The next query inserted the records from the Const_Group table where the Con_Name did not already exist in the Constituent table and set BBI_Summ_Flag to “no.” This query attempted to reduce the overlapping records that might exist in the Compound_Group and Const_Group reports. The next query inserted Con_Names from the BBI_Inventory. This was done to make sure all constituents were loaded into the Constituent Table, even if they only existed in the BBI_Inventory. The final query updated the Constituent table by setting the BBI_Summ_Flag to “yes” where the Con_Name was in the BBI_Inventory table.
Next, the Con_Group table was filled using three INSERT queries. The first inserted all of the groups found in the Const_Group and Compound_Group tables. The last query added one group called “None” which would be used with the BB tables in Con_Synonym.
Next, the Synonym_Group table was filled using four INSERT queries that simply added “Best Basis Inventory Long Name,” “Best Basis Inventory Short Name,” “Constituent Name,” and “Chemical_Name.”
The CCB_Log table was filled using one query that took data from the BBI_Summary table. Only the records with the lowest Publish and Issuance dates for each CCB_Log number were used. This made it possible to have one record for each CCB_Log. Some data was lost as a result of this operation since one CCB_Log may have had multiple values in the corresponding fields.
The Tank table was filled using an INSERT and an UPDATE query. The INSERT query selected and inserted the unique Tank_IDs from the BBI_Inventory table. It clipped off the leading “241-“ for the Tank_ID, but left it on for the Tank_Alt_Name. It also set each of the Tank_Types to “DST.” The UPDATE query set the Tank_Type to “SST” for the “A” and “AX” tanks.
Filling the Con_Group_Member table required two INSERT queries. The first joined the Con_Group, Const_Group, and Constituent tables to obtain the Con_Ids associated with each group. The second used a similar approach but used the Compound_Group table instead of the Const_Group table.
Next, the Con_Synonym table was filled using four INSERT queries. The first joined on Compound_Group, Con_Group, and Constituent and inserted the returned values, setting the Con_Syn_ID to the ID of the “Chemical Name” Con_Synonym. The next three queries used the same idea, but set the Con_Syn_ID to the appropriate value.
The Inventory table was filled with one query that obtained most of its data from the BBI_Inventory table. The Con_ID field was filled for each record by selecting the Con_ID from the Constituent table where Con_name was equal to Analyte_Name and the BBI_Summ_Flag was equal to “Yes”.
The Tank_Measure table was filled with a simple INSERT query that pulled the data straight from the WT_Summary table.
The Report Recreation Process consisted of the creation of five views that correspond to the five original reports. A script was written to create the views (see Appendix D). Each view was defined by a SELECT statement that queried data from multiple tables.
The Sample_Const_Groups view joins the Con_Synonym, Con_Group, Synonym_Group and Constituent tables where Synonym_Group is “Constituent Name”. The Vapor_Comp_Groups view is similar, but selects records where Synonym Group is “Chemical Name”. The BBI_Tank_Inven view joins the Inventory, Constituent, Con_Synonym, Synonym_Group, CCB_Log, and Tanks tables where Synonym_Group is “Best Basis Inventory Short Name.” The Meas_Waste_Tank_Summ view joins the Tank_Measure and Tanks tables. Finally, the BBI_Summary view joins the Inventory, Con_Synonym, Synonym_Group and Constituent tables where Synonym_Group is “Best Basis Inventory Short Name.” The cross tab effect of the query is created using the Sum aggregate function and Case statements. The long list of Sum statements was created using a query on the Tanks table, which concatenated the desired strings to the returned values.
The project appeared successful at this point. The number of records returned by each view matched the number in the original reports. A complete validation was not performed, but several comparisons where made between the views and reports. No incongruences were found between the two, except for slight differences in some of the numerical fields. This issue is discussed in the Issues Encountered section.
While using BCP to import the data from the reports, three of the reports imported successfully, while the other two did not. The program printed “Truncation Error” to the screen and did not load any of the data from these tables. This meant that one or more fields in the destination tables were smaller than the data that was being loaded into them. It would have been helpful if the BCP program had supplied more information about where the error was occurring. In order to find the fields that were too short, DTS was used to import the data. It was found that DTS loaded the data without errors, but truncated those fields that were too long. The culprit fields were found by manually searching the tables to find values that appeared truncated. It seems that both BCP and DTS have room for improvement in the error reporting area.
Another issue encountered involved redundancy in the Constituent and Con_Synonym tables. The Constituent table should have had one record for each CAS number, if the CAS number was available. The data was transferred to the production tables in a way that attempted to dissolve this issue by only loading each constituent once. This worked well except for the case where the Const_Group report contained multiple constituents with the same CAS number. There were only two instances of this, but it was enough to cause some redundancy within the Constituent and Con_Synonym tables. The Constituent redundancy directly caused the Con_Synonym redundancy since multiple Con_Ids were given to single constituents and the two tables were joined on the Con_ID. This problem was not resolved because it was considered minor, and had no effect on the output of the reports.
The last issue encountered involved the conversion of a string value to a float value. SQL Server exhibited a problem with this conversion, in that it did not produce a float value that exactly matched the string representation. The float values often differed from the original by 1 x 10⁻⁹ or less.
While the project was overall successful, there where some areas that could have been executed differently. First of all, the implemented design for the Production tables involved two fields that were not needed. These were the BBI_Summ_Flag field of the Constituent table and the Comment field of the Con_Group table. The BBI_Summ_Flag was not needed because the same information could be found using the Con_Synonym and Synonym_Group tables. By joining with these tables where the Synonym_Group is “Best Basis Inventory Short Name” we can derive those constituents that were used in the BBT_Inventory and BBI_Summary reports. The Comment field is not needed within the Con_Group table, because the comments within the Compound_Group report were associated with the Constituent not the Group. The comment was handled in the Constituent table Comment field.
Another proposed change involves the Inventory and CCB_Log tables. While the client required that the CCB_Log table should only include one record for each CCB_Log number, it would be more accurate to add the Doc_Number to the primary key. The client requested that the record with the earliest Publish and Issue dates for each unique CCB_Log number should be used for the CCB_Log table. While this may be an attempt by the client to clean up data problems, it appears to destroy the true relationship between the CCB_Log number and the Document number. Not enough was known about the true nature of this data, so the client should have been asked to clarify this situation.
--create_tables.sql
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Const_Group'
AND type = 'U')
DROP
TABLE Const_Group
GO
CREATE TABLE Const_Group
(
Constituent_Group
varchar (20),
Constituent_Name
varchar(65),
CAS_Number
varchar (20)
)
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Compound_group'
AND type = 'U')
DROP
TABLE Compound_group
GO
CREATE TABLE Compound_group
(
Element_Functionality_Group
varchar (30),
Chemical_Name
varchar (100),
Chemical_ID
varchar (18),
Comments_on_Classification
varchar (60)
)
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'BB_Summary'
AND type = 'U')
DROP
TABLE BB_Summary
GO
CREATE TABLE BB_Summary
(
Analyte
varchar (10), Standard_Units_of_Measure varchar (2),
A101
varchar (8), A102 varchar (8), A103 varchar (8), A104 varchar (8), A105 varchar
(8),
A106
varchar (8), AN101 varchar (8), AN102 varchar (8), AN103 varchar (8),
AN104
varchar (8), AN105 varchar (8), AN106 varchar (8), AN107 varchar (8),
AP101
varchar (8), AP102 varchar (8), AP103 varchar (8), AP104 varchar (8),
AP105
varchar (8), AP106 varchar (8), AP107 varchar (8), AP108 varchar (8),
AW101
varchar (8), AW102 varchar (8), AW103 varchar (8), AW104 varchar (8),
AW105
varchar (8), AW106 varchar (8), AX101 varchar (8), AX102 varchar (8),
AX103
varchar (8), AX104 varchar (8), AY101 varchar (8), AY102 varchar (8),
AZ101
varchar (8), AZ102 varchar (8)
)
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'BBT_Inven'
AND type = 'U')
DROP
TABLE BBT_Inven
GO
CREATE TABLE BBT_Inven
(
Tank
varchar (10), Analyte_Name varchar (25), Analyte varchar (15),
Inventory
varchar (10), Units varchar (2), Basis varchar (5), Comment varchar (225),
Inventory_Type
varchar (4), Effective_Date varchar (26), Publish_Date varchar (26),
CCB_Log#
varchar (12), Document_Number varchar (40),
Document_Issuance_Date
varchar (26), SortOrder varchar (3)
)
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'WT_Summary'
AND type = 'U')
DROP
TABLE WT_Summary
GO
CREATE TABLE WT_Summary
(
TankName
varchar (15), ReportDate varchar (26),
Waste_Material
varchar (35), Tank_Integrity varchar (15),
Isolation_Status
varchar (60), Total_Waste_Volume varchar (6),
Supernatant_Volume
varchar (6), Drainable_Interstitia_Volume varchar (6),
Volume_Pumped_This_Month
varchar (6), Total_Volume_Pumped varchar (6) ,
Drainable_Liquid_Remaining
varchar (6),
Pumpable_Liquid_Remaining
varchar (6), Sludge_Volume varchar (6),
Salt_Cake_Volume
varchar (6), Liquids_Volume_Method varchar (100),
Solids_Volume_Method
varchar (100), Solids_Volume_Updated varchar (26),
Last_PhotoDate
varchar (26), Last_VideoDate varchar (26),
Space_Available
varchar (6), DoubleShell_Slurry_DSS varchar (6),
TankUse
varchar (35), Equivalent_Waste_Inches varchar (50)
)
--create_prod_tables.sql
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Con_Synonym'
AND type = 'U')
DROP
TABLE Con_Synonym
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Con_Group_Member'
AND type = 'U')
DROP
TABLE Con_Group_Member
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Con_Group'
AND type = 'U')
DROP
TABLE Con_Group
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Synonym_Group'
AND type = 'U')
DROP
TABLE Synonym_Group
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Inventory'
AND type = 'U')
DROP
TABLE Inventory
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'CCB_Log'
AND type = 'U')
DROP
TABLE CCB_Log
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Tank_Measure'
AND type = 'U')
DROP
TABLE Tank_Measure
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Tanks'
AND type = 'U')
DROP
TABLE Tanks
GO
IF EXISTS(SELECT name
FROM sysobjects
WHERE
name = 'Constituent'
AND type = 'U')
DROP
TABLE Constituent
GO
CREATE TABLE Constituent
(
Con_ID
int identity(1,1),
Con_Name
varchar(100),
Con_Unit
varchar(3),
Con_CAS
varchar(11),
BBI_Summ_Flag
varchar(1),
Comment
varchar(255),
CONSTRAINT
Con_ID_PK PRIMARY KEY (Con_ID)
)
GO
CREATE TABLE Con_Group
(
Con_Group_ID
int identity(1,1),
Con_Group_Name
varchar(30),
Comment
varchar(255),
CONSTRAINT
Con_Group_PK PRIMARY KEY (Con_Group_ID)
)
GO
CREATE TABLE Con_Group_Member
(
Con_Group_ID
int
FOREIGN
KEY REFERENCES Con_Group(Con_Group_ID),
Con_ID
int
FOREIGN
KEY REFERENCES Constituent(Con_ID)
ON
DELETE NO ACTION,
CONSTRAINT
Con_Group_ID_PK PRIMARY KEY (Con_Group_ID, Con_ID)
)
GO
CREATE TABLE Synonym_Group
(
Con_Syn_ID
int identity(1,1),
Synonym_Group
varchar(100)
CONSTRAINT
Syn_Grp_ID_PK PRIMARY KEY (Con_Syn_ID)
)
GO
CREATE TABLE Con_Synonym
(
Con_Syn_ID
int
FOREIGN
KEY REFERENCES Synonym_Group(Con_Syn_ID),
Con_ID
int
FOREIGN
KEY REFERENCES Constituent(Con_ID),
Con_Group_ID
int
FOREIGN
KEY REFERENCES Con_Group(Con_Group_ID),
Synonym
varchar(100),
CONSTRAINT
Con_Syn_ID_PK PRIMARY KEY (Con_Syn_ID, Con_ID, Con_Group_ID)
)
GO
CREATE TABLE Tanks
(
Tank_ID
varchar(6),
Tank_Alt_Name
varchar(10),
Tank_Type
varchar(3),
CONSTRAINT
Tank_ID_PK PRIMARY KEY (Tank_ID)
)
GO
CREATE TABLE CCB_Log
(
CCB_Log_ID
varchar(15),
Doc_Number
varchar(50),
Doc_Issue_Date
datetime,
Effective_Date
datetime,