The smallest data item that computers support is called a bit, short for "binary digit." It can assume one of two values: 0 (off) or 1 (on). Ultimately, all data items processed by a computer are reduced to combinations of zeros and ones. Eight bits make a byte. A byte is often called a character because a character typed on the keyboard is typically stored as a string of eight bits using ASCII (American Standard Code for Information Interchange); Unicode encodings may use more than one byte per character. A character is a letter, digit, or special character (such as $, ?, *, or even a blank space).
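The mapping from a character to its eight bits can be sketched in a few lines of Python. This is only an illustration; the helper names `char_to_bits` and `bits_to_char` are made up for this example.

```python
def char_to_bits(ch):
    """Return the 8-bit binary string for a single ASCII character."""
    code = ord(ch)  # numeric character code, e.g. 'A' -> 65
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "08b")  # zero-padded to eight binary digits


def bits_to_char(bits):
    """Inverse operation: turn an 8-bit string back into its character."""
    return chr(int(bits, 2))


# A letter, a special character, and a blank space each occupy one byte.
for ch in "A$ ":
    print(repr(ch), char_to_bits(ch))
```

Here the letter A is stored as 01000001, the dollar sign as 00100100, and the blank space as 00100000.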
A set of related characters is called a field. A field conveys a meaning; examples include an IDnumber field, a Name field, and an Address field. Fields are entered in the design view and form the headings for columns in the completed database. One field is designated as the primary key. Its value cannot be duplicated and cannot be left blank, so it distinguishes rows from one another.
A collection of related fields is called a record, sometimes known as a tuple. Records are entered in the datasheet view and form the rows of the database. A record includes, for example, an individual's IDnumber, name, and address.
A collection of related records is a file. The records of all individuals in a group are collected in a file.
A collection of interrelated files stored together with minimum redundancy is called a database. A database is an organized collection of logically related data, structured so that it can be easily stored, manipulated, and retrieved by users. A Database Management System (DBMS) is a software package that enables you to create a database, enter data into the database, modify the data as required, and retrieve information from the database. The database approach emphasizes the integration and sharing of data throughout the organization.
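The hierarchy of field, record, file (table), and database can be sketched with Python's built-in `sqlite3` module standing in for the DBMS. The table name, column names, and sample values below are hypothetical. The sketch demonstrates only the no-duplicates rule for the primary key.

```python
import sqlite3


def build_student_db():
    """Create an in-memory database with one table (file) of student records."""
    conn = sqlite3.connect(":memory:")
    # Each column is a field; id_number is designated as the primary key.
    conn.execute(
        """
        CREATE TABLE student (
            id_number INTEGER PRIMARY KEY,
            name      TEXT NOT NULL,
            address   TEXT
        )
        """
    )
    # Each row is a record made up of related fields.
    conn.executemany(
        "INSERT INTO student (id_number, name, address) VALUES (?, ?, ?)",
        [(1001, "Ada Park", "12 Elm St"), (1002, "Ben Ortiz", "7 Oak Ave")],
    )
    conn.commit()
    return conn


conn = build_student_db()
for record in conn.execute("SELECT id_number, name, address FROM student ORDER BY id_number"):
    print(record)

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO student VALUES (1001, 'Duplicate', 'Nowhere')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("duplicate rejected:", duplicate_rejected)
```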
Data refers to known facts that can be recorded and stored on computer media. Information is data that has been processed to increase the knowledge of the user, so it is easy to use. For example, when students register, the registration forms are data. When the forms are processed and sorted by course, all students taking the same course are listed in alphabetical order. This is information, because it makes it easy for the instructor to take attendance, assign grades, etc. The term database can refer to both data and information.
Components of the Database Environment.
1. Computer-aided software engineering (CASE) tools. Automated tools used to design databases and application programs.
2. Repository. Centralized storehouse for all data definitions, data relationships, screen and report formats.
3. Database Management System (DBMS). Commercial software system used to define, create, maintain, and provide controlled access to the database and the repository.
4. Database. An organized collection of related data, designed to meet the information needs of multiple users.
5. Application programs. Computer programs used to create and maintain the database and provide information to users.
6. User Interface. Languages, menus, and other facilities by which users interact with the system components.
7. Data Administrators. Persons responsible for the overall information resources.
8. System Developers. System analysts and programmers who design new application programs.
9. End Users. Persons who add, delete, and modify data in the database and who request and receive information from it.
DBMSs were first introduced during the 1960s and were used for large and complex ventures such as the Apollo moon-landing project. Hierarchical and network models were introduced in the 1970s for complex data structures such as manufacturing bills of materials. The relational model was first defined by E. F. Codd, an IBM research fellow, in 1970, and became commercially successful in the 1980s. Client/server computing and Internet applications became important in the 1990s. Multimedia data, including graphics, sound, images, and video, also became more common, and so object-oriented databases were introduced. Combinations of relational and object-oriented databases, known as object-relational databases, are now available. In the future, multidimensional data will become more important. Other future developments include universal servers; fully distributed databases, which physically distribute databases to multiple locations and update them automatically; and content-addressable storage, where users retrieve data by specifying what they want rather than how to retrieve it. Finally, artificial intelligence and television-like information services will enable users to request data in more natural language, with database technology anticipating users' data needs based on past queries and relevant database changes.
Metadata. Data is useful when placed in some context. Metadata are data that describe the properties or characteristics, such as definitions, structures, and rules or constraints, of other data.
| Name | Type | Description |
|---|---|---|
| Course | Alphanumeric | Course ID and Number |
| Section | Integer | Section Number |
| Semester | Alphanumeric | Semester and Year |
| Name | Alphanumeric | Student Name |
| ID | Integer | Student ID (SSN) |
| Major | Alphanumeric | Student Major |
| GPA | Decimal | Student Grade Point Average |
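To illustrate how metadata gives data its context, the table above can be written down as a small Python structure and used to check a record. This is a sketch only: the mapping of Alphanumeric, Integer, and Decimal to `str`, `int`, and `float`, the `validate` helper, and the sample record values are all assumptions made for the example.

```python
# Metadata: field name -> (expected Python type, description).
METADATA = {
    "Course": (str, "Course ID and Number"),
    "Section": (int, "Section Number"),
    "Semester": (str, "Semester and Year"),
    "Name": (str, "Student Name"),
    "ID": (int, "Student ID (SSN)"),
    "Major": (str, "Student Major"),
    "GPA": (float, "Student Grade Point Average"),
}


def validate(record):
    """Return a list of problems found when checking a record against the metadata."""
    problems = []
    for field, (expected_type, _description) in METADATA.items():
        if field not in record:
            problems.append(f"{field}: missing")
        elif not isinstance(record[field], expected_type):
            problems.append(f"{field}: expected {expected_type.__name__}")
    return problems


# Hypothetical sample records.
good = {"Course": "MGT 500", "Section": 2, "Semester": "Spring 2024",
        "Name": "Ada Park", "ID": 1001, "Major": "MGT", "GPA": 3.4}
bad = dict(good, Section="two")  # Section should be an integer

print(validate(good))
print(validate(bad))
```

Without the metadata, the value 2 in a record is just a number; with it, a program (or a reader) knows it is a section number and that "two" is not acceptable.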
File Processing Systems. In the beginning of computer-based data processing, there were no databases. To be useful for business applications, computers must be able to store, manipulate, and retrieve large files of data. Computer file processing systems were developed for this purpose. As business applications became more complex, the shortcomings and limitations of file processing systems became evident, so these systems have largely been replaced by database processing systems. File processing systems are now used as backups for database systems.
Disadvantages of File Processing Systems include:
1. Program-Data Dependence. File descriptions are stored within each application program that accesses a given file.
2. Duplication of Data. Applications are developed independently in file processing systems, leading to unplanned duplicate files. Duplication is wasteful because it requires additional storage space, and a change in one file must be made manually in all files. This also results in loss of data integrity. It is also possible that the same data item may have different names in different files, or that the same name may be used for different data items in different files.
3. Limited Data Sharing. Each application has its own private files, with little opportunity to share data outside its own application. A requested report may require data from several incompatible files in separate systems.
4. Lengthy Development Times. There is little opportunity to leverage previous development efforts. Each new application requires the developer to start from scratch by designing new file formats and descriptions.
5. Excessive Program Maintenance. The preceding factors create a heavy program maintenance load.
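Program-data dependence, the first disadvantage in the list above, can be sketched as follows. The fixed-width record layout, the two "application programs", and the sample data are hypothetical; the point is only that each program carries its own copy of the file description.

```python
# Original file layout: name in columns 0-9, course in columns 10-17.
OLD_RECORD = "Ada Park".ljust(10) + "MATH101".ljust(8)


def roster_program(line):
    """First application: hard-codes where the name field lives."""
    return line[0:10].strip()


def billing_program(line):
    """Second application: hard-codes where the course field lives."""
    return line[10:18].strip()


print(roster_program(OLD_RECORD), "|", billing_program(OLD_RECORD))

# The file is redesigned so that the name field is 15 characters wide.
NEW_RECORD = "Ada Park".ljust(15) + "MATH101".ljust(8)

# billing_program still reads columns 10-17 and now returns a wrong value,
# so every program that embeds the old layout has to be edited.
print(repr(billing_program(NEW_RECORD)))
```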
The way in which a database organizes data depends on the type, or model, of the database. The four main models are hierarchical, network, relational, and object-oriented. Each type structures, organizes, and uses data differently. Hierarchical and network models are efficient, but they are complex, inflexible, and require more memory. They are used on mainframes and in large organizations. The lower memory requirement, flexibility, and simplicity of the relational model make it the model of choice for personal computer users. A relational database organizes data in a table format consisting of rows and columns. This is the model we will be using. The object-oriented model is a more recent development. It was developed to manipulate complex types of data such as graphics, video, audio, X-rays, MRI scans, ultrasound images, electrocardiograms, Geographical Information Systems (GIS) working with maps, and educational instruction systems. Combinations of relational and object-oriented databases, known as object-relational databases, are also available.
Advantages of the Database Approach
Program-Data Independence. The separation of data descriptions (metadata) from the application programs that use the data leads to data independence. Data descriptions are stored in a central location called the repository. An organization's data can change and evolve without changing the application programs that process the data.
Minimal Data Redundancy. Traditionally, information systems were developed using a file-processing approach. Each application had its own files, and data was not shared among applications, resulting in a great deal of data redundancy, or repetition of the same data value. The database approach was developed to minimize data redundancy by creating separate files for each entity. Files are referred to as tables, and a database is a collection of related tables. Data files are integrated into a single logical structure. Although redundancy is not completely eliminated, the designer can control its type and amount.
Improved Data Consistency. Consistency is obtained by reducing redundancy. Updating data values is simplified because each value is stored in one place only, and storage is not wasted.
Improved Data Sharing. The database is designed as a shared resource. Authorized users are granted permission to use the database and are provided with user views to facilitate this use.
Improved Productivity of Application Development. The database approach reduces the cost and time of developing new business applications. The programmer can concentrate on the specific functions required for the new application, and the DBMS provides a number of high-level productivity tools, such as form and report generators and high-level languages, that automate some of the activities of database design and implementation.
Enforcement of Standards. Standards include naming conventions, data quality standards, and uniform procedures for accessing, updating, and protecting data.
Improved Data Accessibility and Responsiveness. End users without programming experience can retrieve and display data (using SQL).
Reduced Program Maintenance. Data are independent of the application programs that use them, and either one can be changed without a change in the other.
Data Integrity. The term data integrity refers to the degree to which data is accurate and reliable. Integrity constraints are rules that all data must follow. For example, if a field is a month, then a number greater than 12 is invalid. Similar examples are the number of days in a month, the number of hours in a day, etc. Other invalid values could be pay rates or temperatures that are too high or too low.
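The month and hours-in-a-day rules mentioned above can be sketched as integrity constraints, here using `CHECK` clauses in SQLite through Python's built-in `sqlite3` module. The `timesheet` table, its columns, and the `try_insert` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE timesheet (
        entry_id INTEGER PRIMARY KEY,
        month    INTEGER NOT NULL CHECK (month BETWEEN 1 AND 12),
        hours    REAL    NOT NULL CHECK (hours >= 0 AND hours <= 24)
    )
    """
)


def try_insert(month, hours):
    """Attempt to store one entry; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO timesheet (month, hours) VALUES (?, ?)",
                (month, hours),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(6, 8.0))   # valid month and hours
print(try_insert(13, 8.0))  # month greater than 12
print(try_insert(2, 30.0))  # more than 24 hours in a day
```

Because the rules live in the database rather than in each program, every application that writes to the table is held to the same constraints.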
Costs and Risks of the Database Approach
New, Specialized Personnel. New individuals need to be hired and/or trained, and frequently retrained or upgraded, to implement databases.
Installation and Management Cost and Complexity. A multi-user DBMS is a large and complex suite of software that has a high initial cost, requires a staff of trained personnel to install and operate, and carries substantial annual maintenance and support costs. Hardware and data communications systems may need upgrading. Security software is often required to ensure proper concurrent updating of shared data.
Conversion Costs. Converting an older file processing system to modern database technology costs money and time.
Need for Explicit Backup and Recovery. Comprehensive procedures must be developed and used for providing backup copies of data and for restoring a damaged database.
Organizational Conflict. Conflicts on data definitions, data formats and coding, rights to update shared data, and associated issues are difficult to resolve.
Data Dictionary. Each database has a data dictionary (or catalog) that stores data about the tables and fields within the database. For each table, the data dictionary contains the table name and any relationships with other tables. For each field, the data dictionary records the field name, data type (text, numeric, date, etc.), field size, and validation rules to enforce integrity constraints. Any attempt to enter invalid data results in an error message to the user.
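As one concrete illustration of a data dictionary, SQLite keeps its catalog in a table named `sqlite_master` and reports column details through `PRAGMA table_info`. The sketch below reads both from Python; the `course` table and the `describe` helper are hypothetical, and other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT NOT NULL, credits INTEGER)"
)


def describe(connection, table):
    """Return (field name, data type, required?, primary key?) for each field."""
    # The table name is interpolated directly, so pass only trusted names.
    rows = connection.execute(f"PRAGMA table_info({table})")
    return [
        (name, col_type, bool(notnull), bool(pk))
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


# Table names come from the catalog itself.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
print(tables)
for field in describe(conn, "course"):
    print(field)
```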
Data Maintenance. Data maintenance consists of three basic operations: adding new data, modifying existing data, and deleting data. In PC-based databases, the user interacts directly with the DBMS to perform the maintenance tasks.
Data Retrieval. Data retrieval involves extracting the desired data from the database, using queries or reports. With a query, the user presents a set of criteria that the DBMS uses to select data from the database. A query language enables the user to prepare the query using English-like statements. Each DBMS may have its own query language, but most support Structured Query Language (SQL), a standardized language that was developed specifically to write database queries. SQL commands may be entered directly by the user or may be included in programs written in programming languages. Another method for developing queries is Query-by-example (QBE), which uses a graphical interface to specify the criteria for selecting records. There is no standard QBE format; each DBMS has its own. Queries generally select a relatively small portion of the database and present the data in a standard format displayed on the monitor. A report provides a formatted presentation of data from the database. Reports show larger amounts of data, allowing the user to format the data in any manner. They are normally printed. Reports are designed using a report generator built into the DBMS.
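A query of the kind described above, with selection criteria and SQL embedded in a program, might look like the following sketch. It uses Python's built-in `sqlite3` module; the `enrollment` table, the `students_in` helper, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?)",
    [
        ("Chen", "DB101", 3.6),
        ("Abbot", "DB101", 3.1),
        ("Baker", "NET200", 3.9),
        ("Diaz", "DB101", 2.8),
    ],
)
conn.commit()


def students_in(course, min_gpa=0.0):
    """Select the students in one course who meet a minimum GPA, sorted by name."""
    cursor = conn.execute(
        "SELECT student, gpa FROM enrollment "
        "WHERE course = ? AND gpa >= ? ORDER BY student",
        (course, min_gpa),  # criteria are passed as parameters, not pasted into the SQL
    )
    return cursor.fetchall()


print(students_in("DB101", 3.0))
```

The `WHERE` clause carries the criteria, and `ORDER BY` controls the presentation, so the query returns only a small, sorted portion of the table.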
Concurrency Control. Databases allow concurrent access by many users. A record locking scheme is used to prevent several users from attempting to update the same record at the same time. When the first user accesses a record for update, the DBMS locks out any further attempt at updating that record until the first update is complete.
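The bookkeeping behind record locking can be sketched as a small lock table. This is a single-threaded illustration only, not how any particular DBMS implements locking; the `RecordLocks` class and the user names are invented for the example.

```python
class RecordLocks:
    """Tracks which user currently holds the update lock on each record."""

    def __init__(self):
        self._owners = {}  # record id -> user holding the lock

    def acquire(self, record_id, user):
        """Grant the lock unless another user already holds it."""
        holder = self._owners.get(record_id)
        if holder is not None and holder != user:
            return False  # locked out until the first update completes
        self._owners[record_id] = user
        return True

    def release(self, record_id, user):
        """Release the lock, but only if this user is the one holding it."""
        if self._owners.get(record_id) == user:
            del self._owners[record_id]


locks = RecordLocks()
print(locks.acquire(42, "alice"))  # first user gets the lock
print(locks.acquire(42, "bob"))    # second user is locked out
locks.release(42, "alice")         # first update is complete
print(locks.acquire(42, "bob"))    # now the second user may update
```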
Security. In addition to a user ID and password, specific privileges can be assigned to each user, defining that user's access to the data. Read-only privilege permits the user only to look at the data; no changes are allowed. Update privilege allows the user to make changes to the data. A DBMS can assign privileges at the field level: a user may be able to change some fields, only look at others, and not even see the rest.
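Field-level privileges can be pictured as a table that maps each user and field to "update", "read", or no access at all. The sketch below is a plain Python illustration under that assumption; the users, fields, and helper names are hypothetical, and a real DBMS manages privileges internally.

```python
# Privilege table: user -> field -> "update", "read", or None (field is hidden).
PRIVILEGES = {
    "clerk": {"name": "update", "address": "update", "salary": None},
    "auditor": {"name": "read", "address": "read", "salary": "read"},
}


def visible_fields(user, record):
    """Return only the fields this user is allowed to see."""
    rights = PRIVILEGES[user]
    return {f: v for f, v in record.items() if rights.get(f) in ("read", "update")}


def can_update(user, field):
    """True only if the user holds the update privilege on the field."""
    return PRIVILEGES[user].get(field) == "update"


employee = {"name": "Ada Park", "address": "12 Elm St", "salary": 52000}
print(visible_fields("clerk", employee))   # salary is not shown
print(can_update("auditor", "name"))       # read-only user cannot change data
```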
Backup and Recovery. Data may be damaged or destroyed due to hardware failure, physical damage caused by fires or floods, and software or human errors. A backup, or copy, must be made periodically. DBMSs include backup routines or rely on system utilities. Recovery is replacing the damaged database with a good backup. Users have to re-enter the data of any transactions lost since the last backup.
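The backup-and-recovery cycle can be sketched with the `backup` method of Python's `sqlite3` connections (available since Python 3.7). The `orders` table and its rows are hypothetical, and in-memory databases stand in for real files here.

```python
import sqlite3

# The live database with one committed transaction.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, item TEXT)")
live.execute("INSERT INTO orders VALUES (1, 'desk')")
live.commit()

# Periodic backup: copy the whole database to a second connection.
backup = sqlite3.connect(":memory:")
live.backup(backup)

# A transaction entered after the backup was taken.
live.execute("INSERT INTO orders VALUES (2, 'chair')")
live.commit()

# Simulated damage: the live data is lost.
live.execute("DELETE FROM orders")
live.commit()

# Recovery: replace the damaged database with the good backup.
backup.backup(live)
recovered = live.execute("SELECT order_id, item FROM orders ORDER BY order_id").fetchall()
print(recovered)
```

Only the order recorded before the backup comes back; the order entered afterward has to be re-entered by the user.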
A Business Example Using the Database Approach.
The first step is to identify the entities. An entity is an object or concept that is important to the business, such as CUSTOMER, PRODUCT, EMPLOYEE, CUSTOMER ORDER, and DEPARTMENT. (Use uppercase letters for entities.) Then prepare a graphical model that shows the associations among these entities. This is referred to as an entity-relationship diagram.
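Once entities and their associations are identified, they are typically carried into tables. The sketch below, using Python's built-in `sqlite3` module, assumes one association for illustration: each CUSTOMER ORDER is placed by exactly one CUSTOMER. The column names, the `orders_for` helper, and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Entity CUSTOMER.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Entity CUSTOMER ORDER; the foreign key records the association with CUSTOMER.
conn.execute(
    """
    CREATE TABLE customer_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        order_date  TEXT NOT NULL
    )
    """
)

conn.execute("INSERT INTO customer VALUES (1, 'Acme Supplies')")
conn.execute("INSERT INTO customer_order VALUES (100, 1, '2024-01-15')")
conn.commit()


def orders_for(customer_id):
    """Follow the association from a customer to that customer's orders."""
    cursor = conn.execute(
        "SELECT order_id, order_date FROM customer_order WHERE customer_id = ? ORDER BY order_id",
        (customer_id,),
    )
    return cursor.fetchall()


print(orders_for(1))
```

An order that refers to a customer who does not exist is rejected, which is how the association drawn in the diagram is kept true in the stored data.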
The Internet ties the "information world" together. The World Wide Web makes the Internet easy to use and gives it the flair and sizzle of multimedia. The Internet exploded into the public consciousness in the mid-1990s, and its growth rate has been phenomenal. The number of host computers grew from 43.2 million in 1999 to 72.4 million in 2000, a 68 percent increase in one year. The number of individuals using the Internet worldwide is estimated to be 333 million.
During the Cold War, after the Soviet Union had developed nuclear weapons, there was fear that computing facilities, which were then very few in number and concentrated in a few locations, could be attacked and destroyed. The facilities were therefore scattered over a number of areas, and a method of communication between them was devised: messages were coded and sent in packets. This technology was developed by the U.S. Department of Defense with the cooperation of a few academics. The idea was similar in a way to what was done in the 1850s when gold shipments were sent east from California and had to avoid being held up. The shipments were divided into many packets and sent by different routes, so that a holdup on one route still allowed the rest to reach the East Coast. In the same way, any intercepted packets were only a small portion of the total message sent.
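The idea of splitting a message into packets that may travel by different routes can be sketched in a few lines of Python. This is a toy illustration, not a real network protocol; the function names, packet size, and sample message are made up.

```python
def to_packets(message, size):
    """Split a message into (sequence number, chunk) packets of at most `size` characters."""
    return [
        (seq, message[start:start + size])
        for seq, start in enumerate(range(0, len(message), size))
    ]


def reassemble(packets):
    """Rebuild the message from packets that may have arrived in any order."""
    return "".join(chunk for _seq, chunk in sorted(packets))


packets = to_packets("MEET AT NOON", 4)
print(packets)

# Packets take different routes and arrive out of order.
arrived = [packets[2], packets[0], packets[1]]
print(reassemble(arrived))
```

The sequence numbers let the receiver put the message back together, and anyone who captures a single packet sees only a fragment such as "NOON".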
Tim Berners-Lee
Tim Berners-Lee, a graduate of Oxford University, England, is now with the Laboratory for Computer Science (LCS) at the Massachusetts Institute of Technology (MIT). He directs the W3 Consortium, an open forum of companies and organizations with the mission to realize the full potential of the Web.
In 1990, Dr. Berners-Lee, a physicist at a laboratory for particle physics in Geneva, Switzerland, perceived that his work would be easier if he and his far-flung colleagues could easily link to one another's computers. He saw the set of links from computer to computer to computer as a spider's web; hence the name Web. The CERN site, the particle physics laboratory where Dr. Berners-Lee worked, is considered the birthplace of the World Wide Web.
Marc Andreessen
Marc Andreessen was a young undergraduate at the University of Illinois at the time of the invention of the World Wide Web. He began work in the early 1990s at the National Center for Supercomputing Applications (NCSA), which was a federally funded program. Andreessen began development of his Mosaic Web browser in 1993, which would revolutionize the Internet in the coming years. Being funded by the government, NCSA was free to explore experimental technologies, and did so extensively with the Internet. Andreessen was interested in combining the existing Internet framework with the multimedia applications made available by hypertext and the World Wide Web.
Marc Andreessen was a college student when, in 1993, he led a team that developed Mosaic, the first widely used graphical browser, which featured a graphical interface so that users could see and click on pictures as well as text. For the viewing public, the Internet now offered both easy movement with Dr. Berners-Lee's links and the attractive images and graphical interface provided by Mosaic.