(B) What is normalization? What are different type of
normalization?
It is set of rules that has been established to aid in the design of tables that are meant to
be connected through relationships. This set of rules is known as Normalization.
Benefits of Normalizing your database include:
√ Avoiding repetitive entries
√ Reducing required storage space
√ Preventing the need to restructure existing tables to accommodate new data.
√ Increased speed and flexibility of queries, sorts, and summaries.
Following are the three normal forms :- First Normal Form:
For a table to be in first normal form, data must be broken up into the smallest units
possible.In addition to breaking data up into the smallest meaningful values, tables in
first normal form should not contain repetitions groups of fields. Second Normal form:
The second normal form states that each field in a multiple field primary keytable must
be directly related to the entire primary key. Or in other words,each non-key field should
be a fact about all the fields in the primary key. Third normal form:
A non-key field should not depend on other Non-key field.The field "Total" is dependent
on "Unit price" and "qty".
(B)What is denormalization ?
Denormalization is the process of putting one fact in numerous places (its vice-versa of
normalization).Only one valid reason exists for denormalizing a relational design - to
enhance performance.The sacrifice to performance is that you increase redundancy in
database.
(B)What is a candidate key ?
A table may have more than one combination of columns that could uniquely identify the
rows in a table; each combination is a candidate key. During database design you can pick
up one of the candidate keys to be the primary key. For example, in the supplier table
supplierid and suppliername can be candidate key but you will only pick up supplierid as
the primary key.
(B) What are different types of joins and whats the
difference between them ? INNER JOIN
Inner join shows matches only when they exist in both tables.Example in the below SQL
there are two tables Customers and Orders and the inner join in made on Customers
Customerid and Orders Customerid. So this SQL will only give you result with customers
who have orders.If the customer does not have order it will not display that record.
SELECT Customers.*, Orders.* FROM Customers INNER JOIN Orders ON
Customers.CustomerID =Orders.CustomerID
LEFT OUTER JOIN
Left join will display all records in left table of the SQL statement.In SQL below customers
with or without orders will be displayed. Order data for customers without orders appears
as NULL values. For example, you want to determine the amount ordered by each
customer and you need to see who has not ordered anything as well. You can also see the
LEFT OUTER JOIN as a mirror image of the RIGHT OUTER JOIN (Is covered in the
next section) if you switch the side of each table.
SELECT Customers.*, Orders.* FROM Customers LEFT OUTER JOIN Orders ON
Customers.CustomerID =Orders.CustomerID
RIGHT OUTER JOIN
Right join will display all records in right table of the SQL statement.In SQL below all
orders with or without matching customer records will be displayed. Customer data for
orders without customers appears as NULL values. For example, you want to determine
if there are any orders in the data with undefined CustomerID values (say, after a conversion
or something like it). You can also see the RIGHT OUTER JOIN as a mirror image of
the LEFT OUTER JOIN if you switch the side of each table.
SELECT Customers.*, Orders.* FROM Customers RIGHT OUTER JOIN Orders
ON Customers.CustomerID =Orders.CustomerID
(I)What are indexes and What is the difference between
clustered and nonclustered indexes?
Indexes in SQL Server are similar to the indexes in books. They help SQL Server retrieve
the data quicker
There are clustered and nonclustered indexes. A clustered index is a special type of index
that reorders the way records in the table are physically stored. Therefore table can have
only one clustered index. The leaf nodes of a clustered index contain the data pages.
A nonclustered index is a special type of index in which the logical order of the index
does not match the physical stored order of the rows on disk. The leaf node of a
nonclustered index does not consist of the data pages. Instead, the leaf nodes contain
index rows.
(A)What is the use of OLAP & What's a measure in OLAP?
OLAP is useful because it provides fast and interactive access to aggregated data and the
ability to drill down to detail.
Measures are the key performance indicators that you want to evaluate. To determine
which of the numbers in the data might be measures, a rule of thumb is: If a number
makes sense when it is aggregated, then it is a measure.
(A)What is RAID and how does it work ?
Redundant Array of Independent Disks (RAID) is a term used to describe the technique
of improving data availability through the use of arrays of disks and various data-striping
methodologies. Disk arrays are groups of disk drives that work together to achieve higher
data-transfer and I/O rates than those provided by single large drives. An array is a set of
multiple disk drives plus a specialized controller (an array controller) that keeps track of
how data is distributed across the drives. Data for a particular file is written in segments
to the different drives in the array rather than being written to a single drive.
For speed and reliability, it's better to have more disks. When these disks are arranged in
certain patterns and use a specific controller, they are called a Redundant Array of
Inexpensive Disks (RAID) set. There are several numbers associated with RAID, but the
most common are 1, 5 and 10.
RAID 1 works by duplicating the same writes on two hard drives. Let's assume you have
two 20 Gigabyte drives. In RAID 1, data is written at the same time to both drives.
RAID1 is optimized for fast writes.
RAID 5 works by writing parts of data across all drives in the set (it requires at least three
drives). If a drive failed, the entire set would be worthless. To combat this problem, one
of the drives stores a "parity" bit. Think of a math problem, such as 3 + 7 = 10. You can
think of the drives as storing one of the numbers, and the 10 is the parity part. By removing
any one of the numbers, you can get it back by referring to the other two, like this: 3 + X
= 10. Of course, losing more than one could be evil. RAID 5 is optimized for reads.
RAID 10 is a bit of a combination of both types. It doesn't store a parity bit, so it's fast,
but it duplicates the data on two drives to be safe. You need at least four drives for RAID
10. This type of RAID is probably the best compromise for a database server.
(B)What's the difference between DELETE TABLE and
TRUNCATE TABLE commands?
Following are difference between them :-
√ DELETE TABLE syntax logs the deletes thus making the delete operation
slow.TRUNCATE table does not log any information but it logs information
about deallocation of data page of the table.So TRUNCATE table is faster as
compared to delete table.
√ DELETE table can be rolled back while TRUNCATE can not be.
√ DELETE table can have criteria while TRUNCATE can not.
√ TRUNCATE table can not triggers.
(B)What are the problems that can occur if you do not
implement locking properly in SQL SERVER ?
Following are the problems that occur if you do not implement locking properly in SQL
SERVER. Lost Updates
Lost updates occur if you let two transactions modify the same data at the same time, and
the transaction that completes first is lost. You need to watch out for lost updates with
the READ UNCOMMITTED isolation level. This isolation level disregards any type of
locks, so two simultaneous data modifications are not aware of each other. Suppose that
a customer has due of 2000$ to be paid.He pays 1000$ and again buys a product of
500$.Lets say that these two transactions are now been entered from two different counters
of the company.Now both the counter user start making entry at the same time 10:00
AM.Actually speaking at 10:01 AM the customer should have 2000$-1000$+500 = 1500$
pending to be paid.But as said in lost updates the first transaction is not considered and
the second transaction overrides it.So the final pending is 2000$+500$ = 2500$.....I hope
the company does not loose the customer. Non-Repeatable Read
Non-repeatable reads occur if a transaction is able to read the same row multiple times
and gets a different value each time.Again, this problem is most likely to occur with the
READ UNCOMMITTED isolation level. Because you let two transactions modify data
at the same time, you can get some unexpected results. For instance, a customer wants to
book flight , so the travel agent checks for the flights availability.Travel agent finds a seat
and goes ahead to book the seat.While the travel agent is booking the seat , some other
travel agent books the seat.When this travel agent goes to update the record , he gets
error saying that “Seat is already booked”.In short the travel agent gets different status at
different times for the seat. Dirty Reads
Dirty reads are a special case of non-repeatable read. This happens if you run a report
while transactions are modifying the data that you're reporting on. For example there is a
customer invoice report which runs on 1:00 AM in afternoon and after that all invoices
are sent to the respective customer for payments.Lets say one of the customer has 1000$
to be paid.Customer pays 1000$ at 1:00 AM and at the same time report is run. Actually
customer has no money pending but is still issued a invoice. Phantom Reads
Phantom reads occur due to a transaction being able to read a row on the first read, but
not being able to modify the same row due to another transaction deleting rows from the
same table. Lets say you edit a record in the mean time somebody comes and deletes the
record , you then go for updating the record which does not exist....Panic.
Interestingly, the phantom reads can occur even with the default isolation level supported
by SQL Server: READ COMMITTED. The only isolation level that doesn't allow
phantoms is SERIALIZABLE, which ensures that each transaction is completely isolated
from others. In other words, no one can acquire any type of locks on the affected row
while it is being modified.
(B)What are different transaction levels in SQL SERVER
? Twist :- what are different types of locks in SQL SERVER ?
Transaction Isolation level decides how is one process isolated from other process. Using
transaction levels you can implement locking in SQL SERVER.
There are four transaction levels in SQL SERVER :- READ COMMITTED
The shared lock is held for the duration of the transaction, meaning that no other
transactions can change the data at the same time. Other transactions can insert and
modify data in the same table, however, as long as it is not locked by the first transaction. READ UNCOMMITTED
No shared locks and no exclusive locks are honored. This is the least restrictive isolation
level resulting in the best concurrency but the least data integrity. REPEATABLE READ
This setting disallows dirty and non-repeatable reads. However, even though the locks
are held on read data, new rows can still be inserted in the table, and will subsequently be
read by the transaction. SERIALIZABLE
This is the most restrictive setting holding shared locks on the range of data. This setting
does not allow the insertion of new rows in the range that is locked; therefore, no phantoms
are allowed.
Following is the syntax for setting transaction level in SQL SERVER.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
(I)What are different locks in SQL SERVER ?
Depending on the transaction level six types of lock can be acquired on data :- Intent
The intent lock shows the future intention of SQL Server's lock manager to acquire locks
on a specific unit of data for a particular transaction. SQL Server uses intent locks to
queue exclusive locks, thereby ensuring that these locks will be placed on the data elements
in the order the transactions were initiated. Intent locks come in three flavors: intent
shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
IS locks indicate that the transaction will read some (but not all) the resources in the table
or page by placing shared locks.
IX locks indicate that the transaction will modify some (but not all) resources in the table
or page by placing exclusive locks.
SIX locks indicate that the transaction will read all resources, and modify some(but not
all) of them. This will be accomplished by placing the shared locks on the resources read
and exclusive locks on the rows modified. Only one SIX lock is allowed per resource at
one time; therefore, SIX locks prevent other connections from modifying any data in the
resource (page or table), although they do allow reading the data in the same resource.
Shared
Shared locks (S) allow transactions to read data with SELECT statements. Other
connections are allowed to read the data at the same time; however, no transactions are
allowed to modify data until the shared locks are released. Update
Update locks (U) are acquired just prior to modifying the data. If a transaction modifies a
row, then the update lock is escalated to an exclusive lock; otherwise, it is converted to a
shared lock. Only one transaction can acquire update locks to a resource at one time.
Using update locks prevents multiple connections from having a shared lock that want to
eventually modify a resource using an exclusive lock. Shared locks are compatible with
other shared locks, but are not compatible with Update locks. Exclusive
Exclusive locks (X) completely lock the resource from any type of access including reads.
They are issued when data is being modified through INSERT, UPDATE and DELETE
statements. Schema
Schema modification locks (Sch-M) are acquired when data definition language statements,
such as CREATE TABLE, CREATE INDEX, ALTER TABLE, and so on are being
executed. Schema stability locks (Sch-S) are acquired when store procedures are being
compiled. Bulk Update
Bulk update locks (BU) are used when performing a bulk-copy of data into a table with
TABLOCK hint. These locks improve performance while bulk copying data into a table;
however, they reduce concurrency by effectively disabling any other connections to read
or modify data in the table.
(I)What is LOCK escalation?
Lock escalation is the process of converting a lot of low level locks (like row locks, page
locks) into higher level locks (like table locks). Every lock is a memory structure too
many locks would mean, more memory being occupied by locks. To prevent this from
happening, SQL Server escalates the many fine-grain locks to fewer coarse-grain locks.
Lock escalation threshold was definable in SQL Server 6.5, but from SQL Server 7.0
onwards it's dynamically managed by SQL Server.
(B)What are the different ways of moving data/ databases between servers and databases in SQL
Server?
There are lots of options available, you have to choose your option depending upon your
requirements. Some of the options you have are: BACKUP/RESTORE, detaching and
attaching databases, replication, DTS, BCP, logshipping, INSERT...SELECT,
SELECT...INTO, creating INSERT scripts to generate data.
(B)What is the difference between a HAVING CLAUSE
and a WHERE CLAUSE?
You can use Having Clause with the GROUP BY function in a query and WHERE
Clause is applied to each row before they are part of the GROUP BY function in a query.
(B) What is difference between UNION and UNION ALL
SQL syntax ?
UNION SQL syntax is used to select information from two tables.But it selects only
distinct records from both the table. , while UNION ALL selects all records from both
the tables.
(I) What is ACID fundamental and what are transactions
in SQL SERVER ?
A transaction is a sequence of operations performed as a single logical unit of work. A
logical unit of work must exhibit four properties, called the ACID (Atomicity, Consistency,
Isolation, and Durability) properties, to qualify as a transaction: Atomicity
√ A transaction must be an atomic unit of work; either all of its data
modifications are performed or none of them is performed. Consistency
√ When completed, a transaction must leave all data in a consistent
state. In a relational database, all rules must be applied to the transaction's
modifications to maintain all data integrity. All internal data structures, such
as B-tree indexes or doubly-linked lists, must be correct at the end of the
transaction. Isolation
√ Modifications made by concurrent transactions must be isolated
from the modifications made by any other concurrent transactions. A
transaction either sees data in the state it was in before another concurrent
transaction modified it, or it sees the data after the second transaction has
completed, but it does not see an intermediate state. This is referred to as
serializability because it results in the ability to reload the starting data and
replay a series of transactions to end up with the data in the same state it was
in after the original transactions were performed. Durability
√ After a transaction has completed, its effects are permanently in
place in the system. The modifications persist even in the event of a system
failure.
(A) What is purpose of Replication ?
Replication is way of keeping data synchronized in multiple databases. SQL server
replication has two important aspects publisher and subscriber.
Publisher
Database server that makes data available for replication is called as Publisher.
Subscriber
Database Servers that get data from the publishers is called as Subscribers.
(I) What is BCP utility in SQL SERVER ?
BCP (Bulk Copy Program) is a command line utility by which you can import and export
large amounts of data in and out of SQL SERVER database.
(I)What are different types of triggers in SQl SERVER
2000 ?
There are two types of triggers :- √ INSTEAD OF triggers
INSTEAD OF triggers fire in place of the triggering action. For example, if an INSTEAD
OF UPDATE trigger exists on the Sales table and an UPDATE statement is executed
against the Salestable, the UPDATE statement will not change a row in the sales table.
Instead, the UPDATE statement causes the INSTEAD OF UPDATE trigger to be
executed, which may or may not modify data in the Sales table. √ AFTER triggers
AFTER triggers execute following the SQL action, such as an insert, update, or delete.This
is the traditional trigger which existed in SQL SERVER.
INSTEAD OF triggers gets executed automatically before the Primary Key and the Foreign
Key constraints are checked, whereas the traditional AFTER triggers gets executed after
these constraints are checked.
Unlike AFTER triggers, INSTEAD OF triggers can be created on views.
(A)If we have multiple AFTER Triggers on table how can
we define the sequence of the triggers ?
If a table has multiple AFTER triggers, then you can specify which trigger should be
executed first and which trigger should be executed last using the stored procedure
sp_settriggerorder. All the other triggers are in an undefined order which you cannot
control.
(A)What is SQl injection ?
It is a Form of attack on a database-driven Web site in which the attacker executes
unauthorized SQL commands by taking advantage of insecure code on a system connected
to the Internet, bypassing the firewall. SQL injection attacks are used to steal information
from a database from which the data would normally not be available and/or to gain
access to an organization’s host computers through the computer that is hosting the
database.
SQL injection attacks typically are easy to avoid by ensuring that a system has strong
input validation.
As name suggest we inject SQL which can be relatively dangerous for the database.
Example this is a simple SQL
SELECT email, passwd, login_id, full_name
FROM members
WHERE email = 'x'
Now somebody does not put “x” as the input but puts “x ; DROP TABLE members;”.
So the actual SQL which will execute is :-
SELECT email, passwd, login_id, full_name
FROM members
WHERE email = 'x' ; DROP TABLE members;