Wednesday, March 18, 2015

Creating a constraint to ensure intersection table only joins certain records

I have a database with the following tables

I need to ensure that users only becomes friends with other users of the same department.

Approach 1 – DepartmentId on Friend

On http://stackoverflow.com/a/20443626/263003, diarmuid suggested to add the DepartmentId on the Friend table and use the foreign key constraints of (MyselfId, AppId) and (FriendId, AppId). However, I would have to set the primary key of the User table as (Id, DepartmentId) – this creates a super key and I want to avoid that.

Approach 2 – CHECK CONSTRAINT on VIEW

The simplest approach I could think of is to create a VIEW joining User to Friend to User, and add a CHECK CONSTRAINT on the view. However, CHECK CONSTRAINT only applies on TABLE and not VIEW.

Approach 3 – CROSS JOIN and UNIQUE INDEX on VIEW

On http://stackoverflow.com/a/9751172/263003, Martin Smith uses the approach of selecting the records we do not want, creating duplicate records of those using CROSS JOIN, and ensuring those records cannot exists by using a UNIQUE INDEX. Let’s start by creating the INDEXED VIEW

CREATE VIEW [dbo].[CheckFriendView]
WITH SCHEMABINDING
AS
SELECT myself.Id AS MyselfId, myFriend.Id AS MyFriendId, myself.DepartmentId AS MyDepartmentId, myFriend.DepartmentId AS FriendDepartmentId
FROM  dbo.[User] AS myself
INNER JOIN  dbo.Friend
ON dbo.Friend.MyselfId = myself.Id
INNER JOIN dbo.[User] AS myFriend
ON dbo.Friend.FriendId = myFriend.Id
WHERE myself.DepartmentId <> myFriend.DepartmentId

GO

CREATE UNIQUE CLUSTERED INDEX IDX_CheckFriendView
   ON [dbo].[CheckFriendView] (MyselfId, MyFriendId)

GO

However, I got the error

Cannot create index on view "Demo.dbo.CheckFriendView". The view contains a self join on "Demo.dbo.User".

Let’s avoid self join by moving the join to a WHERE clause

CREATE VIEW [dbo].[CheckFriendView]
WITH SCHEMABINDING
AS
SELECT MyselfId, FriendId
FROM dbo.Friend AS f
WHERE EXISTS (
    SELECT 1
    FROM  dbo.[User] AS myself
    INNER JOIN dbo.Friend
    ON dbo.Friend.MyselfId = myself.Id
    INNER JOIN dbo.[User] AS myFriend
    ON dbo.Friend.FriendId = myFriend.Id
    WHERE myself.Id = f.MyselfId AND
          myFriend.Id = f.FriendId AND
          myself.DepartmentId <> myFriend.DepartmentId
)

GO

CREATE UNIQUE CLUSTERED INDEX IDX_CheckFriendView
   ON [dbo].[CheckFriendView] (MyselfId, FriendId)

GO

Another error comes up

Cannot create index on view "Demo.dbo.CheckFriendView" because it contains one or more subqueries. Consider changing the view to use only joins instead of subqueries. Alternatively, consider not indexing this view.

Looks like SQL Server will never let us join.

Approach 4 – Turning MyselfId and FriendId into a single column

Jonathan avoided the self join by combining the MyselfId and FriendId columns with the help of a numbers table containing 2 entries, then joining into the User table – http://jmkehayias.blogspot.sg/2008/12/creating-indexed-view-with-self-join.html

Let’s start by combining MyselfId and FriendId

CREATE VIEW [dbo].[CheckFriendOneColumnView]
WITH SCHEMABINDING
AS
SELECT MyselfId, CASE WHEN n.Number = 1 THEN MyselfId ELSE FriendId END AS MyselfOrFriendId
FROM dbo.Friend AS f
CROSS JOIN dbo.[Number] AS n

And then add the joining of User

INNER JOIN dbo.[User] AS u
ON f.UserId = u.Id OR f.FriendId = u.Id

Oh wait a minute, this is getting unnecessarily complicated. Could we just use

INNER JOIN dbo.[User] AS u
ON f.UserId = u.Id OR f.FriendId = u.Id

without the cross join?

Finally

We do an INNER JOIN with OR condition. Each row in Friend will become 2 rows, one with the DepartmentId for myself, the other with the DepartmentId for my friend.

Then aggregate the (MyselfId, DepartmentId). Since all the friends should belong to the same department, this aggregation should leave MyselfId as unique.

Lastly add the UNIQUE INDEX on MyselfId.

CREATE VIEW [dbo].[CheckFriendView]
WITH SCHEMABINDING
AS
SELECT f.MyselfId, u.DepartmentId, COUNT_BIG(*) AS BigCount
FROM dbo.Friend AS f
INNER JOIN dbo.[User] AS u
ON f.MyselfId = u.Id OR f.FriendId = u.Id
GROUP BY f.MyselfId, u.DepartmentId

GO

CREATE UNIQUE CLUSTERED INDEX IDX_CheckFriendView
    ON [dbo].[CheckFriendView] (MyselfId)

GO

Note: I had to add COUNT_BIG(*) AS BigCount due to another error

Cannot create index on view 'CheckFriendView' because its select list does not include  a proper use of COUNT_BIG. Consider adding COUNT_BIG(*) to select list.

which basically says I must add it!

Update:

Warning: while the INNER JOIN ON a OR b works, it results in very inefficient query (no JOIN predicate) and slows down every changes in the database!

No comments: