File organization • ChrisLi-Tech

Def: File organization refers to the way data is stored in a file.

Text file vs binary file

When records are stored in a text file, the number of data items per line and the number of characters per item must both be known so that the data can be read back correctly.
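As a minimal sketch of why both numbers matter, the Python below writes and reads one line of a text file in which every item has a fixed width. The widths (6, 10 and 8 characters) and the sample values are assumptions made up for illustration; they are not from the notes.

```python
# Hypothetical layout: 3 items per line, with 6, 10 and 8 characters each.
FIELD_WIDTHS = (6, 10, 8)

def format_line(values, widths=FIELD_WIDTHS):
    """Pad (or cut) each item to its fixed width and join them into one line."""
    return "".join(f"{value[:width]:<{width}}" for value, width in zip(values, widths))

def parse_line(line, widths=FIELD_WIDTHS):
    """Split a line back into items using only the known widths."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields

line = format_line(["000101", "Ann Lee", "120.50"])
print(parse_line(line))
```

Without the list of widths, the reader could not tell where one item ends and the next begins, because nothing in the line itself separates them.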

In a serial file, the records are put into the file chronologically, i.e. according to when each record is produced. The one produced first is put into the file first, then the next one, and so on.
Each new record is simply appended to the file, so the only ordering in the file is the time order of data entry.
Typical example:
A bank recording transactions involving customer accounts.
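Appending in time order can be sketched in Python as follows. The comma-separated record layout, the file name and the sample transactions are assumptions for illustration only.

```python
import os
import tempfile

def append_record(path, timestamp, account, amount):
    """Add one transaction at the end of the file; existing records are untouched."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{timestamp},{account},{amount}\n")

def read_all(path):
    """Read the records back in the order in which they were written."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n").split(",") for line in f]

def demo():
    # Hypothetical transactions written to a temporary file.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "transactions.txt")
        append_record(path, "2024-01-05 09:00", "ACC-17", "50.00")
        append_record(path, "2024-01-05 09:07", "ACC-03", "-20.00")
        return read_all(path)

print(demo())
```

Opening the file in append mode (`"a"`) means a new record never disturbs the earlier ones, which is why the file ends up in order of data entry and in no other order.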

A sequential file has records that are ordered.
To allow a sequential file to be ordered, there has to be a key field whose values are unique and sequential but not necessarily consecutive.
What needs to be done when a new record is to be added?
The old file is read sequentially and each record is written to a new file.
This continues until the appropriate position for the new record is reached.
The new record is then written to the new file, before the remaining records in the old file are copied across.
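The copy-and-insert procedure above might look like this in Python. It is only a sketch: it assumes each record is one text line of the form `key,data` with an integer key, that keys are unique, and that the new file replaces the old one at the end. The file names and sample records are hypothetical.

```python
import os
import tempfile

def insert_record(path, new_key, new_data):
    """Insert a record into a file that is ordered by an integer key."""
    new_path = path + ".new"
    inserted = False
    with open(path, encoding="utf-8") as old, open(new_path, "w", encoding="utf-8") as new:
        for line in old:
            key = int(line.split(",", 1)[0])
            # The first record with a larger key marks the insertion point.
            if not inserted and key > new_key:
                new.write(f"{new_key},{new_data}\n")
                inserted = True
            new.write(line)  # copy the existing record unchanged
        if not inserted:
            # The new key is larger than every existing key: it goes last.
            new.write(f"{new_key},{new_data}\n")
    os.replace(new_path, path)  # the new file takes the place of the old one

def demo():
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "accounts.txt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("10,Ann\n30,Cy\n")
        insert_record(path, 20, "Bob")
        insert_record(path, 40, "Dee")
        insert_record(path, 5, "Al")
        with open(path, encoding="utf-8") as f:
            return [int(line.split(",", 1)[0]) for line in f]

print(demo())
```

Every insertion rewrites the whole file, which is the cost of keeping the records in key order.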

Random-access files
Any record in the file can be accessed without reading the file sequentially.
Direct access can be achieved with a sequential file.
A separate index file is created, which has two fields per record.
The first field holds a key field value and the second field holds the position in the main file of the record with that key value.
The alternative is to use a hashing algorithm.
What is a hashing algorithm?
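A rough Python sketch of the index-file idea follows. It assumes the main file holds one `key,data` record per line and that "position" means the byte offset of that line; both are illustrative choices, and the file names and records are hypothetical.

```python
import os
import tempfile

def build_index(main_path, index_path):
    """Write one 'key,offset' line to the index for every record in the main file."""
    with open(main_path, "rb") as main, open(index_path, "w", encoding="utf-8") as index:
        while True:
            offset = main.tell()          # byte position where this record starts
            line = main.readline()
            if not line:
                break
            key = line.split(b",", 1)[0].decode()
            index.write(f"{key},{offset}\n")

def lookup(main_path, index_path, key):
    """Find a record by key: search the small index, then jump into the main file."""
    with open(index_path, encoding="utf-8") as index:
        positions = dict(line.rstrip("\n").split(",") for line in index)
    if str(key) not in positions:
        return None
    with open(main_path, "rb") as main:
        main.seek(int(positions[str(key)]))   # direct access, no sequential read
        return main.readline().decode().rstrip("\n")

def demo(key):
    with tempfile.TemporaryDirectory() as folder:
        main_path = os.path.join(folder, "main.txt")
        index_path = os.path.join(folder, "index.txt")
        with open(main_path, "wb") as f:
            f.write(b"101,Ann\n205,Bob\n310,Cy\n")
        build_index(main_path, index_path)
        return lookup(main_path, index_path, key)

print(demo(205))
```

Only the index is searched; the main file is opened and read at a single position.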

If there is a numeric key field in each record:

Choose a suitable number.
Divide the value in the key field by this number.
The remainder from this division then identifies the address in the file for storage of that record.
The suitable number works best if it is a prime number of a similar size to the expected size of the file.
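The remainder calculation can be sketched in a few lines of Python. The divisor 97 (a prime, standing in for a file of roughly 100 records) and the sample keys are assumptions for illustration.

```python
def hash_address(key, divisor=97):
    """Return the remainder when the key value is divided by the chosen number."""
    return key % divisor

print(hash_address(1234))  # 1234 = 12 * 97 + 70, so the address is 70
print(hash_address(1331))  # 1331 = 13 * 97 + 70, so the address is also 70
```

The two sample keys differ by exactly 97, so they produce the same remainder and would be sent to the same address.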
Collision: the situation where the same address is calculated for different key field values.
Ways to handle collisions
-Use a sequential search to look for a vacant address following the calculated one.
-Keep a number of overflow addresses at the end of the file.
-Have a linked list accessible from each address.
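The first of these options, searching sequentially for a vacant address, can be sketched as below. A Python list stands in for the addresses of the file, and the search wraps round to the start when it reaches the end; the wrap-around, the file size of 7 and the sample keys are all assumptions for illustration.

```python
def store(table, key, data):
    """Store a record at its hashed address, or at the next vacant address after it."""
    start = key % len(table)
    for step in range(len(table)):
        address = (start + step) % len(table)   # wrap round at the end of the file
        if table[address] is None or table[address][0] == key:
            table[address] = (key, data)
            return address
    raise RuntimeError("no vacant address left")

def find(table, key):
    """Follow the same search path that was used when the record was stored."""
    start = key % len(table)
    for step in range(len(table)):
        address = (start + step) % len(table)
        if table[address] is None:
            return None                          # vacant address: the key is not stored
        if table[address][0] == key:
            return table[address][1]
    return None

table = [None] * 7            # 7 addresses, all vacant
print(store(table, 10, "a"))  # 10 % 7 = 3
print(store(table, 17, "b"))  # 17 % 7 = 3 as well, so the record goes to address 4
print(find(table, 17))
```

Retrieval has to repeat the same sequential search, because a record that collided is no longer at the address its key hashes to.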