SQL - Purchase Order and Invoice duplicate amount

Hi Everyone.

I hope you can help me to resolve my issues:

  1. My aim is to gather accurate spend/volume data from businesses.
  2. Usually, a purchase order (PO) comes first, before invoice.
  3. I want to be able to get one purchase order per invoiceNo based on the following condition.(a). If the source is “PO” or “INV” and have the same InvoiceNO, eg, 930015980, and the PO has value (“PO Line Amount”), then I want to only retain PO where source = PO (b). However, if PO “PO Line Amount”) has no value, I want to retain the INV where source = INV (PO Line Invoice Amount) (c). If both have value, I want to retain only Purchase order.
  4. I don’t want to have both INV and PO amount for the same invoiceNo and Vendor_Number.

Please my Test code below: Thanks for your help.

IF OBJECT_ID('tempdb..#Purchase_Invoice_Results') IS NOT NULL DROP TABLE #Purchase_Invoice_Results
GO
CREATE Table #Purchase_Invoice_Results(
[Vendor_Number] varchar NULL,
[VendorName] varchar NULL,
[InvoiceNo] varchar NULL,
[legalEntity] varchar NULL,
[Source] varchar NULL,
[Currency] varchar NULL,
[PO Line Amount] varcharNULL,
[Po Line Invoice Amount] varcharNULL,
[PO Line Receipt Amount] varcharNULL,
[LEPurchaseOrderLineInvoiceAmount] varcharNULL,
[PO Line Price] varcharNULL,
[InvoiceEntryFiscalYear] varcharNULL,
[InvoiceEntryDate] varcharNULL,
[invoicePostedFiscalYear] varcharNULL,
[invoicePostedDate] varcharNULL,
[PurchaseOrderFiscalYear] varchar NULL,
[PurchaseOrderDate] varchar NULL

) ON [PRIMARY]
GO

INSERT INTO #Purchase_Invoice_Results([Vendor_Number],[VendorName]
,[InvoiceNo],[legalEntity],[Source],[Currency],[PO Line Amount],[Po Line Invoice Amount],[PO Line Receipt Amount]
,[LEPurchaseOrderLineInvoiceAmount]
,[PO Line Price],[InvoiceEntryFiscalYear],[InvoiceEntryDate] ,[invoicePostedFiscalYear] ,[invoicePostedDate]
,[PurchaseOrderFiscalYear],[PurchaseOrderDate]
)
VALUES
('EUX-100530','Western Foods Plc.','930015991','RLM01','INV','HKD', '0', '0', '0', '22776.24', '0', '2025','15/09/2024','2025','27/10/2024', null,null),
('EUX-100530','Western Foods Plc.','930015980','RLM01','INV','HKD', '0','0','0','21639.66','0','2025','08/09/2024','2025', '08/11/2024', null,null),
('EUX-100530','Western Foods Plc.','930015992','RLM01','INV','HKD', '0','0','0','22179.86','0','2025','15/09/2024','2025','27/10/2024',NULL,NULL),
('EUX-100530','Western Foods Plc.','930015980','RLM01','PO', 'HKD', '231012.05','210855.48','231012.05','21194.27','121.632','2025','08/09/2024','2025','08/11/2024','2024','12/06/2024'),
('EUX-100530','Western Foods Plc.','930015991','RLM01','PO','HKD','250504.79','228912.16','250504.79','23216.24','136.824','2025','15/09/2024','2025','27/10/2024','2024','24/06/2024'),
('EUX-100530','Western Foods Plc.','930015992','RLM01','PO','HKD','234667.09','222890.63','234667.09','22605.56','90.548571','2025','15/09/2024','2025','27/10/2024' ,'2024','28/06/2024'),
('EUX-100530','Western Foods Plc.','930015993','RLM01','PO','HKD','216701.58','216701.58','216701.58','21781.91','66.575','2025', '15/09/2024','2025','27/10/2024','2024','12/06/2024'),
('EUX-100530','Western Foods Plc.','930015993','RLM01','INV', 'HKD', '0','0' ,'0','21566.89','0','2025','15/09/2024','2025', '27/10/2024',NULL,NULL);

Select * from #Purchase_Invoice_Results

I'm having some difficulty understanding the example and test code provided, as it doesn't seem to execute correctly on my instance of Microsoft SQL Server. That said, I believe I understand the intended logic. Here's an approach I would recommend:

;WITH CTE_SmartSolution AS
(
SELECT
Vendor_Number,
VendorName,
InvoiceNo,
MAX(CASE WHEN Source='PO' THEN [PO Line Amount] END) AS POLineAmount_PO,
MAX(CASE WHEN Source='INV' THEN [PO Line Amount] END) AS POLineAmount_INV
FROM #Purchase_Invoice_Results
GROUP BY
Vendor_Number,
VendorName,
InvoiceNo
)
SELECT
*
FROM CTE_SmartSolution

A few additional observations:

  • If your test data stores numeric values as VARCHAR, you may encounter issues with calculations and aggregations. It's advisable to use appropriate data types for numeric fields.

  • The consistency of column naming is important. For example, Vendor_Number and VendorName should follow a consistent naming convention to improve readability and maintainability.

  • Proper formatting and naming conventions will save you time and frustration in the long run—especially when scaling or integrating with other systems.

I hope this helps clarify the approach. Let me know if you'd like assistance refining the logic further or troubleshooting the test code.

Disclaimer
Since English is not my first language, I used Copilot to help refine my response and present it in a more professional manner, including this disclaimer :slight_smile:

Hi Rogier

Many thanks. However, I only want to get one Invoice_NO, eg - 930015980. The logic would be if the PO has value, keep Purchase order amount and disregard the PO Line Invoice Amount. If both have value, take the Purchase order amount. If the Purchase order amount is null, keep the PO Line Invoice Amount. I don’t want to have duplicate or repeated InvoiceNO. Hope I have explained myself. Thanks

Could you please clarify what you mean by "PO has value"? As far as I can see, "Purchase Order Amount" is not a defined column in the dataset. If it's returning null, it would be helpful to understand how that’s occurring—especially since no sample data was provided to illustrate the issue.

It's difficult to assist effectively when there are inconsistencies or missing context. To help move things forward, I’ve created a query that generates one row per invoice. The first part of the query builds the necessary base table, and the second part adds the logic to produce the final result. This structure makes the query easier to maintain and more accessible for others to support.

;WITH CTE_SmartSolution AS
(
SELECT
Vendor_Number,
VendorName,
InvoiceNo,
MAX(CASE WHEN Source='PO' THEN [PO Line Amount] END) AS POLineAmount_PO,
MAX(CASE WHEN Source='PO' THEN [PO Line Invoice Amount] END) AS POLineInvoiceAmount_PO,
MAX(CASE WHEN Source='INV' THEN [PO Line Amount] END) AS POLineAmount_INV
FROM #Purchase_Invoice_Results
GROUP BY
Vendor_Number,
VendorName,
InvoiceNo
)
SELECT
InvoiceNo,
POLineAmount_PO,
POLineInvoiceAmount_PO,
CASE WHEN POLineAmount_PO <> '0' THEN POLineInvoiceAmount_PO ELSE POLineAmount_PO END AS AccurateData
FROM CTE_SmartSolution

Based on your initial description, I hope the query aligns with your intent. If it doesn’t meet your needs, please provide a clear example of the final output you're aiming for, including relevant fields and sample data. That will allow me to assist you more accurately.

Disclaimer
Since English is not my first language, I used Copilot to help refine my response and present it in a more professional manner, including this disclaimer :slight_smile:

Thank you so much. I will comeback to you after testing the codes. Once again, I appreciate your time and gesture.

Many Thanks.

hi

hope this helps

extremely simple way of explaining to others
extremely easy to understand
keyword "E X T R E M E L Y"

same invoiceNo and Vendor_Number

"PO Line Amount" has value 
take "PO Line Amount"

"PO Line Amount" has NO value 
take "PO Line Invoice Amount"

"PO Line Amount" , "PO Line Invoice Amount" both value 
take "LEPurchaseOrderLineInvoiceAmount"

create table with appropriate "Data Types"
insert sample data

IF OBJECT_ID('tempdb..#Purchase_Invoice_Results') IS NOT NULL DROP TABLE #Purchase_Invoice_Results
GO

CREATE TABLE #Purchase_Invoice_Results (
    [Vendor_Number] varchar(100) NULL,
    [VendorName] varchar(100) NULL,
    [InvoiceNo] int NULL,
    [legalEntity] varchar(100) NULL,
    [Source] varchar(100) NULL,
    [Currency] varchar(100) NULL,
    [PO Line Amount] numeric(20,2) NULL,
    [Po Line Invoice Amount] numeric(20,2) NULL,
    [PO Line Receipt Amount] numeric(20,2) NULL,
    [LEPurchaseOrderLineInvoiceAmount] numeric(20,2) NULL,
    [PO Line Price] numeric(20,2) NULL,
    [InvoiceEntryFiscalYear] int NULL,
    [InvoiceEntryDate] date NULL,
    [invoicePostedFiscalYear] int NULL,
    [invoicePostedDate] date NULL,
    [PurchaseOrderFiscalYear] int NULL,
    [PurchaseOrderDate] date NULL
) ON [PRIMARY];
GO


INSERT INTO #Purchase_Invoice_Results (
  [Vendor_Number], [VendorName], [InvoiceNo], [legalEntity],
  [Source], [Currency], [PO Line Amount], [Po Line Invoice Amount],
  [PO Line Receipt Amount], [LEPurchaseOrderLineInvoiceAmount], [PO Line Price],
  [InvoiceEntryFiscalYear], [InvoiceEntryDate], [invoicePostedFiscalYear], [invoicePostedDate],
  [PurchaseOrderFiscalYear], [PurchaseOrderDate]
) VALUES
-- Row A: PO Line Amount has value -> LEO = PO Line Amount
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  100.00, NULL, NULL, 100.00, 50.00,
  2016, '2016-05-26', 2016, '2016-05-26',
  2016, '2016-05-26'),

-- Row B: PO Line Amount NULL, PO Line Invoice Amount has value
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  NULL, 75.00, NULL, 75.00, 50.00,
  2016, '2016-05-27', 2016, '2016-05-27',
  2016, '2016-05-27'),

-- Row C: Both values present -> PO Line Amount takes precedence
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  120.00, 80.00, NULL, 120.00, 50.00,
  2016, '2016-06-28', 2016, '2016-06-28',
  2016, '2016-06-28'),

-- Row D: Both null
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  NULL, NULL, NULL, NULL, 50.00,
  2016, '2016-06-29', 2016, '2016-06-29',
  2016, '2016-06-29'),

-- Row E: PO Line Amount = 0 treated as no value, PO Line Invoice Amount has value
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  0.00, 90.00, NULL, 90.00, 50.00,
  2016, '2016-07-28', 2016, '2016-07-28',
  2016, '2016-07-28'),

-- Row F: PO Line Amount value, different date window
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  60.00, NULL, NULL, 60.00, 50.00,
  2016, '2016-07-29', 2016, '2016-07-29',
  2016, '2016-07-29'),

-- Row G: Large value test
('V12345', 'Vendor Alpha', 555001, 'LE-XYZ', 'SystemA', 'USD',
  250.00, 200.00, NULL, 250.00, 50.00,
  2016, '2016-08-01', 2016, '2016-08-01',
  2016, '2016-08-01');

select 'Sample Data',* from #Purchase_Invoice_Results

hi

hope this helps

i could be Wrong "just an open ended discussion"