anything
AirNav RadarBox
Welcome, Guest. Please login or register.
Did you miss your activation email?

Login with username, password and session length
 


Author Topic: SQL help required  (Read 10033 times)

0 Members and 4 Guests are viewing this topic.

testmonkey

  • Full Member
  • ***
  • Posts: 127
SQL help required
« on: August 09, 2008, 08:01:12 PM »
Hi all,

I've got myself a copy of SQLite Database Browser so that I can try and fix a few incorrect details in some of the tables. My current focus is trying to correct all of the records in the Aircraft table where the company and construction number information is mixed up. I've taught myself enough SQL to extract the first batch of records to correct and modify them but can't for the life of me work out how to get them back into the Aircraft table correctly.

Here's what I've done so far:

Created 2 backups of NavData.db3 - one to edit and the other one as a spare in case my edited one doesn't work in RB.

1. Created a new table with the same layout as the Aircraft table (plus a spare field to use to swap data around).

CREATE TABLE aircraft2
(
MS varchar(6),
AR varchar(50),
AT varchar(80),
AN varchar(80),
AC varchar(80),
CN varchar(80),
PT varchar(80),
LK varchar(80),
PT2 varchar(80),
LK2 varchar(80),
MINE varchar(80)
)

2. Copy all records from the Aircraft table where the company name starts with 1.

INSERT INTO aircraft2 (ms,ar,at,an,ac,cn,pt,lk,pt2,lk2)
SELECT ms,ar,at,an,ac,cn,pt,lk,pt2,lk2
FROM aircraft
WHERE ac LIKE '1%'

I then manually go through this table to check for any records that have a legitimate company name that starts 1 and delete them.

3. I then swap the data in the AC and CN fields around with these 3 statements

UPDATE aircraft2
SET mine = ac

UPDATE aircraft2
SET ac = cn

UPDATE aircraft2
SET cn = mine

Now I need to work out how to import the corrected data back into the correct records in the Aircraft table. I think I need something involving an UPDATE and an INNER JOIN but can't figure it out. I'm sure the code above could be improved as well!

This process would then be repeated for all records where the company name began 2, 3, etc.... and also E1, E2 and any others I find.

Any help gratefully recieved.

Thanks
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)

testmonkey

  • Full Member
  • ***
  • Posts: 127
Re: SQL help required
« Reply #1 on: August 10, 2008, 01:33:51 PM »
It's okay I've sorted it now.

REPLACE INTO aircraft (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft2

Hopefully someone else can make use of this as well. Will make logging more accurate and allow better searching but will need to be run fairly regularly.
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)

Allocator

  • RadarBox24.com Beta Testers
  • Hero Member
  • *
  • Posts: 3568
Re: SQL help required
« Reply #2 on: August 10, 2008, 02:33:28 PM »
Crikey, that looks clever - I'll give it a go on a backup copy of my NavData.db3 :-)

bailey_uk

  • Full Member
  • ***
  • Posts: 241
Re: SQL help required
« Reply #3 on: August 10, 2008, 05:36:29 PM »
Ah SQL takes me back! Did it at A-Level ICT, fries my brain!
Possibly the youngest RadarBox Owner!

DaveReid

  • Hero Member
  • *****
  • Posts: 1815
    • Heathrow last 100 ADS-B arrivals
Re: SQL help required
« Reply #4 on: August 12, 2008, 12:35:18 PM »
3. I then swap the data in the AC and CN fields around with these 3 statements

UPDATE aircraft2
SET mine = ac

UPDATE aircraft2
SET ac = cn

UPDATE aircraft2
SET cn = mine

You can swap values in a SQL table with a single statement:

UPDATE aircraft2
SET ac = cn, cn = ac

HTH
Dave
This post has been scanned for any traces of negativity, bias, sarcasm and general anti-social behaviour

testmonkey

  • Full Member
  • ***
  • Posts: 127
Re: SQL help required
« Reply #5 on: August 12, 2008, 05:43:33 PM »
You can swap values in a SQL table with a single statement:

UPDATE aircraft2
SET ac = cn, cn = ac

HTH
Dave

Thanks Dave though I have to admit I did figure that one out eventually after trying to put them on seperate lines within 1 script and getting an error. Any other tips gratefully received though.
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)

testmonkey

  • Full Member
  • ***
  • Posts: 127
Re: SQL help required
« Reply #6 on: August 12, 2008, 06:21:37 PM »
I'm now wondering if I'm going too far with this!

Essentially I'd like to get the NavData database as clean as possible ready to start a new MyLog when RB2009 (v1.6) comes out shortly. The original database that we start with is pretty clean I think but I've already added several thousand new planes to that, many of which do need some cleaning or extra data.

Here's my current list of objectives:

a. Check the company names for instances starting 1 to 0, containing 1 to 0, blanks and null values as all of these could indicate a record where the company name is wrongly in the construction / serial number field. If so switch the fields around.
b. Check the aircraft type code field (i.e. B738) for ..., blank or null values and enter the correct codes.
c. Check the registration field for ..., null or blank values and search for the ModeS code online to fill in the missing value(s) - (often only have the ModeS code in these cases).
d. Check the aircraft name field for ..., blank, null or 'Tie-up extracted from Country Sequence' values and fill in any missing details.
e. Check for records where the company name is set to 'Untitled (airline_name)' and change them to the airline name only.
f. Search for company names that are represented in various forms i.e. Easyjet, Easyjet Airlines, Easyjet Airline Company etc. and change them to one standard value. Careful with Easyjet of course that we set the Swiss ones seperately.

Here's the script for those that want it to change the instances in E above where Untitled exists in the company name field:

UPDATE aircraft2
SET mine = SUBSTR(ac,10,length(ac)-10)
WHERE ac LIKE 'untitled(%'

where mine is my spare field I use to check the corrections before commiting them back to the company name field. Use this if the name is 'Untitled(blahblah)' or use

UPDATE aircraft2
SET mine = SUBSTR(ac,11,length(ac)-11)
WHERE ac LIKE 'untitled (%'

where there is a space between the Untitled and first ( such as 'Untitled (blahblah)'.

So the questions are;

1. Am I going too far?
2. Has anyone else performed a similar exercise?
3. Does anyone want me to post all the scripts I'm using to this thread with explanations for each one so they can do the same?
4. Why!?! :)))))

Right, back to editing. I reckon I've got a few thousand to do still and not enough spare time.
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)

GlynH

  • Hero Member
  • *****
  • Posts: 612
Re: SQL help required
« Reply #7 on: August 12, 2008, 07:51:03 PM »
Your knowledge on SQL far exceeds mine...

Something I wanted to do - and have achieved with some measure of success - was the merging of two databases.

The reason behind this was when I bought RB I installed it on my main desktop machine and ran it for a few days.

During that time I received over 3,500 contacts.

I then moved it to a server spec machine in my loft and being new to the game didn't copy over my database - not sure if that would have worked without changing anything else though?

During the next week I received over 8,500 contacts but noticed that some interesting ones in my first database had not been seen again i.e. Vulcan and it would be a good idea to merge them as I intended to keep the RB on this second machine.

These were duly merged but I forgot to take into account the duplicate contacts as I was focussing on not losing any aircraft and so I was over the moon to see the exact total of both databases reflected in the new one which was @ 70MB in size!

When I browsed MyLog however iin the top window I noticed many aircraft had two entries.

Then it struck me that a straight merge was no good - I really needed to perform a 'Search for duplicate entries - delete one entry BUT take into account First Time & Last Time.

This should effectively remove the duplicate aircraft entry in the top window and modify the First Time / Last Time to record the earliest & latest times recorded from the two entries - if that makes sense.

/*
THIS IS FIRST COMMAND!!!

It change duplicates FirstTime on early date */

update Aircraft
set FirstTime =
(
select min (FirstTime)
from Aircraft a2
where a2.ModeS = Aircraft.ModeS
);


/*

THIS IS SECOND COMMAND!!!

It delete duplicates from aircrafts and leave only aircrafts with late LastTime

*/

delete from Aircraft
where exists
(
  select *
  from Aircraft a2
  where (a2.ModeS = Aircraft.ModeS) and (Aircraft.LastTime < a2.LastTime)
)

This might have worked but I have never got round to implementing the changes as RadarBox started to slow to a crawl and became unusable...

There is an excellent tool for dealing with this type of thing called SQLite Analyzer available from www.kraslabs.com and their support of the program is above & beyond the call of duty.

I know...I needed it! :-)

So yes...please post your scripts...

Does anyone know if there is a limit as to the database size in RadarBox that might cause the program to slow down to a crawl?

Thanks & regards,
-=Glyn=-
« Last Edit: August 13, 2008, 04:13:16 PM by GlynH »

testmonkey

  • Full Member
  • ***
  • Posts: 127
Re: SQL help required
« Reply #8 on: August 13, 2008, 12:28:24 AM »
Your knowledge on SQL far exceeds mine...

I wish GlynH - I only started learning it Saturday evening so I could do this!

Anyway here goes with the scripts and remember boys and girls to make 2 back-ups first - 1 to make your changes to and the second as a replacement if the edited one fails to start RB (RB actually creates its own backup each time it's ran 'NavData.db3.bak' that you could use).

First off I use this script to create a second aircraft table to do all the changes in:

CREATE TABLE aircraft2
(
MS varchar(6),
AR varchar(50),
AT varchar(80),
AN varchar(80),
AC varchar(80),
CN varchar(80),
PT varchar(80),
LK varchar(80),
PT2 varchar(80),
LK2 varchar(80),
MINE varchar(80)
)

This table can be left in the database from this point forward as RB will just ignore it. The mine column isn't in the original aircraft table but I've added it so that I can use it as a temporary place to put changes for checking before I put them back into the column they belong in.

After doing each set of changes you will need to run this script to commit them back into the aircraft table:

REPLACE INTO aircraft (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft2

Then you'll need to run this script to clear the aircraft2 table ready for your next set of changes:

DELETE FROM aircraft2

If you want to remove the aircraft2 table just use

DROP TABLE aircraft2

but as I say it's not necessary. Obviously don't forget to copy the amended table back into your RB folder before running RB again.

Okay, that's the admin scripts done. Now for the actual changes.

--------
Let's start with the scripts for fixing aircraft that have the company name switched with the construction number. Run this first:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE AC LIKE '1%'

This will move into the aircraft2 table any aircraft where the company name starts with '1'. You'll have to manually browse these records and delete any where the company name field is actually correct i.e 1st Air Company so you're just left with the records that need changing. Now run:

UPDATE aircraft2
SET mine = ac, ac = cn, cn = mine

All of the necesary records should now be corrected so run the second and third admin scripts to apply the changes to the aircraft table and then clear the aircraft2 table.

Now repeat changing 1% to 2%, then 3% .......... 0% to capture construction numbers starting 2,3 through to 0. Continue to repeat using %1%, %2% etc ....... %0% to capture records where the company name currently contains a number. This is to cover records where the construction number starts with a letter (or more) instead. You can also try replacing 1% with ... or empty quotes '' [blank] or null - this last one will usually bring back records with only a ModeS code or ModeS and reg which we'll correct later.

---------------------
---------------------

Next we'll fix aircraft records that have a ModeS code but no other aircraft details. Run this script to extract them to the aircraft2 table:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE ar LIKE 'null'

Now do an internet search on each ModeS code in the aircraft2 table to find it's other details (I tend to use airframes.org). Enter the details into this script to apply each one:

UPDATE aircraft2
SET at='B738', an='Boeing 737-86N', ac='Spicejet', cn='32672', ar='VT-SPW'
WHERE ms='0D03F1'

at = four letter aircraft type code
an = aircraft type name
ac = company / owner
cn = construction number
ar = aircraft registration
ms = ModeS code we searched on

Once complete apply the changes to the aircraft table and clean aircraft2 again using the last 2 admin scripts.

Repeat replacing the LIKE 'null' line in the first script with LIKE '' and then with LIKE '...'

Where no details can be found I tend to replace the ar, at and an fields with ? so that they don't come out in future searches (I'll go back and try to resolve them specifically once a month).

--------------------------------------
--------------------------------------

Now we'll do the same as above but for aircraft that already had a registration but no actual aircraft details. Run these scripts instead of those above:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE an LIKE 'null'

UPDATE aircraft2
SET at='B738', an='Boeing 737-86N', ac='Spicejet', cn='32672'
WHERE ar='VT-SPW'

Once complete apply the changes to the aircraft table and clean aircraft2 again then run again replacing the LIKE 'null' line in the first script with LIKE '' and then with LIKE '...'

-------------------------------------
-------------------------------------

Now we'll look at aircraft with no aircraft type code. Run this script to extract the first batch:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE at LIKE '...'

We'll want to replace these is an many batches as possible rather than one at a time so if for instance the first aircraft was a Piper PA-28-161 Cherokee Warrior II (an column) we might use this script:

UPDATE aircraft2
SET at = 'PA28'
WHERE an LIKE 'piper pa-28%'

The % is a wildcard so all aircraft who's names start Piper PA-28 in the list will be updated with a type code of PA28. Seaching is not case sensitive! You'll have to choose your own values for the last line of the script above depending on what suits. If your feeling really brave you could change that last line to read WHERE an LIKE '%pa-28%' which will cover every aircraft with pa-28 somewhere in the name text but I'd recommend not doing so.

Once complete (and applied to aircraft / cleaned aircraft2) try again replacing the ... in the first stage with [blank] and then 'null'.

------------------------------------------
------------------------------------------

To fix airline / owner names that start with Untitled (or any other text) we would run these scripts:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE ac LIKE 'Untitled%'

followed by

UPDATE aircraft2
SET mine = SUBSTR(ac,10,length(ac)-10), ac = mine
WHERE ac LIKE 'untitled(%'

which will change Untitled(United) to read United. If we had Untitled (United) with the space we'd use

UPDATE aircraft2
SET mine = SUBSTR(ac,11,length(ac)-11), ac = mine
WHERE ac LIKE 'untitled (%'

SUBSTR extracts a middle section from a piece of text:-
SUBSTR(column name,character number to start at,length of text to retain)

In the last example above the string Untitled ( is 10 characters so we want to start extracting from character 11. We dont know what length the text inside the parenthesis will be so to set the length of text to retain we use length(ac)-11 - the length of the total string in column ac minus 11 characters, 10 to cover Untitled ( plus 1 for the end ). Hope that's understandable! Don't forget also to chage the last line WHERE ac LIKE 'untitled (%' to ensure the correct records are identified.

If we had rubbish United Airlines rubbish our script would read:

UPDATE aircraft2
SET mine = SUBSTR(ac,9,length(ac)-16), ac = mine
WHERE ac LIKE 'rubbish%'

Care must be exercised if similar text needs to be edited on different records. I've used 2 scripts
WHERE ac LIKE 'untitled(%'
WHERE ac LIKE 'untitled (%'
because using untitled% would only work for some of the records dependng on which numbers we set in SUBSTR(ac,11,length(ac)-11) - all the records would be altered but one or the other set 'untitled(' or 'untitled (' would be incorrect.

Hopefully the above is clear. Just check your changes before you apply them to the aircraft table to be sure.

----------------------------------------
----------------------------------------

The last change we'd probably want to do is to standardise the airline name for all aircraft from a particular company. Most people have probably got aircraft from Easyjet listed as
Easyjet
Easyjet Airline
Easyjet Airlines
Easyjet Airline Company
Easyjet Switzerland
and possibly EasySwiss

To extract all of the these for changing we'd use:

INSERT INTO aircraft2 (MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2)
SELECT MS,AR,AT,AN,AC,CN,PT,LK,PT2,LK2
FROM aircraft
WHERE ac LIKE 'Easyjet%'

To edit just the UK part of the fleet we'd use

UPDATE aircraft2
SET ac = 'Easyjet'
WHERE ar LIKE 'G%' only aircraft with a reg starting G

and likewise the Swiss part of the fleet with

UPDATE aircraft2
SET ac = 'Easyjet Switzerland' or EasySwiss if you prefer
WHERE ar LIKE 'HB%'

They can then all be applied as one to the aircraft table.

--------------------------------------

That's it really. There are other changes but armed with the above you should be able to change any of the scripts or write your own to suit whatever changes you need to make. Other than the SUBSTR script they are all simple SQL (need to be for my beginner status). I'm sure someone will come along with much more powerful or simplified scripts that would save no end of time / effort but these will at least do the trick.
« Last Edit: August 13, 2008, 12:31:26 AM by testmonkey »
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)

tarbat

  • ShipTrax Beta Testers
  • Hero Member
  • *
  • Posts: 4219
    • Radarbox at Easter Ross
Re: SQL help required
« Reply #9 on: August 13, 2008, 03:13:35 PM »
If your doing some serious SQL'ing, I can recommend SQLite Maestro as an excellent tool.  It's Visual Query Builder will write most of the SQL you'll need, and easily enables queries to be built that join tables to enable foreign key lookups, etc.

I've just used it to:
1. Take ModeS, Registration and ICAOType information from Aerodata's database of aircraft.
2. Matched that to all the aircraft currently in my NavData database.  Out of 81191 aircraft, 3510 needed their Registration (161) and/or ICAOType (3387) information updated.
3. Used the Export/Import facilities in SQLite Maestro to put that info. into my NavData database.

My priority was to get Registration and ICAOType corrected, since these fields drive the photo lookup and new silhouettes feature.  I checked a 10% sample of the Registrations, and ALL of them were incorrect or out-of-date in the NavData database - I checked on airframes.org.

testmonkey

  • Full Member
  • ***
  • Posts: 127
Re: SQL help required
« Reply #10 on: August 13, 2008, 06:29:54 PM »
Thanks tarbat - I'll take a look at Maestro as soon as I get time (really tied up tonight unfortunately). Need to teach myself some more SQL though to speed things up a little.

Pity airframes.org don't allow any bots. Would be good to go through the whole aircraft table automatically. I've already found quite a few of my aircraft on there that have only been delivered in the last couple of weeks.

Cheers
testmonkey
--------------------------------------------
RB - MVT9000 - UBC30XLT (for acars)