Requires an Inbuilt function or any other solution

db, sql server · Kākināda, India
  • 13 years ago

    Hi experts,

    I am facing a performance issue. This is the scenario.

    For example, a table has the columns EmpId, EmpName, DeptId, DeptName and Sal. Except for Sal, all the remaining columns have the same values in every row (I mean the same data).

    Assume there are 4 rows in that table.

    So I wrote a select query to get the data as a single row, like 1,xx,1,yy,50,500,5000,1000

    where 50,500,5000,1000... are the Sal column values.

    I wrote a function that returns the Sal values as a comma-separated list when the remaining column values are the same. But for a huge number of records (assume 25,000 records), the performance is very bad.

    So I need an inbuilt query / any other solution for this problem.
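
    To make the requirement concrete, here is a sketch of the data, using the table and column names that appear in the replies below:

    -- four sample rows, identical except for Sal
    insert into Duplicates (EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 50)
    insert into Duplicates (EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 500)
    insert into Duplicates (EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 5000)
    insert into Duplicates (EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 1000)

    -- desired single result row:
    -- 1, xx, 1, yy, '50,500,5000,1000'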


  • 13 years ago

    I'm sure you already have this... the only way I think you can do this is with a function. I may of course be wrong, but here's what I'd do:

    select EmpId, EmpName, DeptId, DeptName, dbo.fselSalList(EmpId, EmpName, DeptId, DeptName)
    from Duplicates
    group by EmpId, EmpName, DeptId, DeptName


    create function fselSalList
    (
        @empId int,
        @empName varchar(50),
        @deptId int,
        @deptName varchar(50)
    )
    returns varchar(200)
    as
    begin
        declare @list varchar(200)

        -- build the comma-separated Sal list for the given key;
        -- COALESCE keeps the first value from getting a leading comma
        select @list = COALESCE(@list + ',', '') + cast(Sal as varchar(10))
        from Duplicates
        where EmpId = @empId and EmpName = @empName and DeptId = @deptId and DeptName = @deptName

        return @list
    end

    I'd also put an index on the EmpId, EmpName, DeptId, DeptName columns ... clustered if possible in order to help process the records. After I'd done that, Query Analyser showed virtually all of the cost of the select statement sitting at the function call. Sorry, but I can't test using your 25,000 records, so I have no idea whether this solution would be any better or worse than your current one. I'd certainly be interested to know one way or the other, though.
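
    If you're on SQL Server 2005 or later, another option worth trying is to build the list set-based with FOR XML PATH rather than calling a scalar function per group. A minimal sketch against the same Duplicates table, untested at your volume, so treat it as a starting point:

    select EmpId, EmpName, DeptId, DeptName,
           stuff((select ',' + cast(d2.Sal as varchar(10))
                  from Duplicates d2
                  where d2.EmpId = d.EmpId and d2.EmpName = d.EmpName
                  and d2.DeptId = d.DeptId and d2.DeptName = d.DeptName
                  for xml path('')), 1, 1, '') as SalCSV  -- stuff() strips the leading comma
    from Duplicates d
    group by EmpId, EmpName, DeptId, DeptName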

    Joe 

  • 13 years ago

    Yeah, I did the same thing to solve my problem. But calling the same function repeatedly for 25,000 records was hurting the performance, so I need some other solution to improve it.

    Please help me in this regard.


  • 13 years ago

    The only other thing that I can suggest is that you create another table in your database - kind of running along the lines of data warehousing now - with the columns EmpId, EmpName, DeptId, DeptName and SalCSV (de-normalising). You then initially load this table using the function you already have, and add triggers to the original table that 'adjust' the warehouse table's SalCSV with your function whenever a CRUD operation is made, so that the new warehouse table always holds the current "combination" data.

    In this way you will have a single table that has all the information ready to be extracted - and performance will not be hampered by the number of records you are extracting, since the data is already 'formatted' accordingly.

    Joe 

  • 13 years ago

    Hi Joe,

    Thanks for your reply.

    But I am new to the concepts of data warehousing, so if you don't mind, could you please explain this in detail?


  • 13 years ago

    Ok, generally speaking, relational databases are set up to optimise CRUD-type operations (create, read, update, delete). For this they tend to be normalised - usually to 3rd normal form. Some go to 5th, but that is perhaps overkill; each DBA will have his own thoughts on this. The issue with this is that when it comes time to extract data, there can be quite a lot of interplay between the tables in order to "structure" the data in a format that is useful for reporting and/or data extraction. In other words, it makes the extract process more complicated.

    Data warehousing can be regarded as simply de-normalising the CRUD database into a form that is set up to optimise the extraction of data. In your case, instead of having to use an inline function in the normal extraction process from the CRUD database, the table already exists with the data in the format that you require in the data warehouse database.

    Of course, you now have a dependency. The data warehouse database relies upon the CRUD database in order to function correctly, so some form of processing is required between the two of them. Now, depending upon the environment, this could be a batch job run at specific intervals during the day (or night), or almost immediate (as in CRUD triggers). It depends upon factors such as what performance hit is "acceptable" or how "fresh" the extract data needs to be. What can be said is that, almost inevitably, the data warehouse information "lags" behind the CRUD database.
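
    As an illustration of the batch approach, a minimal refresh sketch (it assumes the DuplicatesCSV table and fselSalList function shown further down this thread):

    -- rebuild the warehouse table from scratch on each scheduled run
    truncate table DuplicatesCSV

    insert into DuplicatesCSV(EmpId, EmpName, DeptId, DeptName, SalCSV)
    select distinct EmpId, EmpName, DeptId, DeptName,
           dbo.fselSalList(EmpId, EmpName, DeptId, DeptName)
    from Duplicates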

    Then again, the original database can be a mix of CRUD type tables and data warehouse tables - but pragmatism will usually dictate which approach is taken.

    I hope that this at least gives you a "flavour" of the differing concepts between the two types of database described.

    Joe 

  • 13 years ago

    Hi Joe,

    Thank you so much.

    If possible, could you please give me an example of how to implement this?

  • 13 years ago

    Here's one way in which it could be done:

    USE [TestBed]
    GO
    /****** Object:  Table [dbo].[Duplicates]    Script Date: 02/06/2008 10:22:18 ******/
    SET ANSI_NULLS ON
    GO
    SET QUOTED_IDENTIFIER ON
    GO
    CREATE TABLE [dbo].[Duplicates](
        [EmpId] [int] NOT NULL,
        [EmpName] [nvarchar](50) COLLATE Latin1_General_CI_AS NOT NULL,
        [DeptId] [int] NOT NULL,
        [DeptName] [nvarchar](50) COLLATE Latin1_General_CI_AS NOT NULL,
        [Sal] [int] NOT NULL
    ) ON [PRIMARY]
    GO
    /****** Object:  Table [dbo].[DuplicatesCSV]    Script Date: 02/06/2008 10:22:30 ******/
    CREATE TABLE [dbo].[DuplicatesCSV](
        [EmpId] [int] NOT NULL,
        [EmpName] [nvarchar](50) COLLATE Latin1_General_CI_AS NOT NULL,
        [DeptId] [int] NOT NULL,
        [DeptName] [nvarchar](50) COLLATE Latin1_General_CI_AS NOT NULL,
        [SalCSV] [nvarchar](200) COLLATE Latin1_General_CI_AS NOT NULL
    ) ON [PRIMARY]
    GO
    USE [TestBed]
    GO
    /****** Object:  Index [IX_Duplicates]    Script Date: 02/06/2008 10:26:18 ******/
    CREATE CLUSTERED INDEX [IX_Duplicates] ON [dbo].[Duplicates]
    (
        [EmpId] ASC,
        [DeptId] ASC,
        [DeptName] ASC,
        [EmpName] ASC
    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
    GO
    /****** Object:  Index [IX_DuplicatesCSV]    Script Date: 02/06/2008 10:26:18 ******/
    CREATE CLUSTERED INDEX [IX_DuplicatesCSV] ON [dbo].[DuplicatesCSV]
    (
        [EmpId] ASC,
        [DeptId] ASC,
        [DeptName] ASC,
        [EmpName] ASC
    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]
    GO
    /****** Object:  UserDefinedFunction [dbo].[fselSalList]    Script Date: 02/06/2008 10:23:59 ******/
    CREATE function [dbo].[fselSalList]
    (
        @empId int,
        @empName varchar(50),
        @deptId int,
        @deptName varchar(50)
    )
    returns varchar(200)
    as
    begin
        declare @list varchar(200)

        select @list = COALESCE(@list + ',', '') + Cast(sal As varchar(10))
        from Duplicates
        where EmpId = @empId and EmpName = @empName and DeptId = @deptId and DeptName = @deptName

        return @list
    end
    GO
    /*
    -- to initially load duplicates tables (should Duplicates table already exist)
    insert into DuplicatesCSV(EmpId, EmpName, DeptId, DeptName, SalCSV)
    select distinct EmpId, EmpName, DeptId, DeptName, dbo.fselSalList(EmpId, EmpName, DeptId, DeptName)
    from Duplicates
    */
    GO
    CREATE TRIGGER [iuDuplicates]
       ON  [dbo].[Duplicates]
       FOR INSERT, UPDATE
    AS
    BEGIN
        set nocount on

        -- remove the stale CSV rows for every affected key
        -- (set-based, so multi-row inserts/updates are handled correctly)
        delete c
        from DuplicatesCSV c
        join inserted i
          on c.EmpId = i.EmpId and c.EmpName = i.EmpName
         and c.DeptId = i.DeptId and c.DeptName = i.DeptName

        -- rebuild them from the current contents of the base table
        insert into DuplicatesCSV(EmpId, EmpName, DeptId, DeptName, SalCSV)
        select distinct EmpId, EmpName, DeptId, DeptName,
               dbo.fselSalList(EmpId, EmpName, DeptId, DeptName)
        from inserted
    END
    GO
    /****** Object:  Trigger [dDuplicates]    Script Date: 02/06/2008 10:27:36 ******/
    CREATE TRIGGER [dDuplicates]
       ON  [dbo].[Duplicates]
       FOR DELETE
    AS
    BEGIN
        set nocount on

        -- drop the CSV rows for every deleted key
        -- (set-based, so multi-row deletes are handled correctly)
        delete c
        from DuplicatesCSV c
        join deleted d
          on c.EmpId = d.EmpId and c.EmpName = d.EmpName
         and c.DeptId = d.DeptId and c.DeptName = d.DeptName

        -- rebuild the CSV rows for any key that still has rows left
        insert into DuplicatesCSV(EmpId, EmpName, DeptId, DeptName, SalCSV)
        select distinct d.EmpId, d.EmpName, d.DeptId, d.DeptName,
               dbo.fselSalList(d.EmpId, d.EmpName, d.DeptId, d.DeptName)
        from deleted d
        where exists
        (
            select 1
            from Duplicates x
            where x.EmpId = d.EmpId and x.EmpName = d.EmpName
            and x.DeptId = d.DeptId and x.DeptName = d.DeptName
        )
    END

    I'm sure that there are other ways... but this works on a small dataset. I'm not sure how 'efficient' it would be for 25,000 records, so try it out on a test 'copy' of your database first.
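
    A quick smoke test of the triggers might look like this (the sample values are hypothetical):

    insert into Duplicates(EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 50)
    insert into Duplicates(EmpId, EmpName, DeptId, DeptName, Sal) values (1, N'xx', 1, N'yy', 500)

    -- expect one row: 1, xx, 1, yy, '50,500' (value order is not guaranteed)
    select * from DuplicatesCSV

    delete from Duplicates where Sal = 500

    -- expect the row to now read: 1, xx, 1, yy, '50'
    select * from DuplicatesCSV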

    Joe 

  • 13 years ago

    Hi Joe,

    Thank you so much for your replies.

    I'll try this.

  • 13 years ago

     No problem .. hope you get it sorted.

    Joe 
