Sponsored By Aspose - File Format APIs for .NET

Aspose are the market leader of .NET APIs for file business formats – natively work with DOCX, XLSX, PPT, PDF, MSG, MPP, images formats and many more!

Unflattening a list, or "How to ask ‘how to’"

Code challenge time! And by code challenge, I mean, “Do my work for me so I don’t have to think/Google”.

We have a parent object, called Parent, and each one has a collection of child objects, called AllegedChild. Ignore potential cycles for this exercise. The database has a single table with fields: ParentID, ParentName, ChildID, ChildName. The data is such:

1, Coding Hillbilly, 1, Brandi-Lynn
1, Coding Hillbilly, 2, Sammy-Jo
1, Coding Hillbilly, 3, Sammy-Jo Jr.
1, Coding Hillbilly, 4, Sammy-Jo Sr.
2, Donald Belcham, 5, Justice Gray
2, Donald Belcham, 5, Dave Laribee
2, Donald Belcham, 5, Scott

We will retrieve this list into a collection of ParentChildDto objects with properties that mirror the database. From this list, we’d like to create a new list of Parent objects each with an appropriate list of AllegedChild objects.

The challenge, given a List<ParentChildDto>, what is a clean and efficient way of creating a List<Parent>? The current implementation looks something like this:

private List<Parent> GitEm( IEnumerable<ParentChildDto> flatList )
{
    rFrom = from p in flatList
            orderby p.ParentId
            select p;

    ParentChildDto prev = null;
    var to = new List<Parent>( );
    foreach( var dto in rFrom )
    {
        if(prev == null || prev.ParentId != dto.ParentId)
        {
            prev = new Parent
                       {
                           Id = dto.ParentId,
                           FullName = dto.Name
                       };
            to.Add(prev);
            prev.AllegedChildren = new List<AllegedChild>();
        }
        if(dto.ChildId != null)
            prev.AllegedChildren.Add(new AllegedChild
                                  {
                                      Id = dto.ChildId,
                                      FullName = dto.Name
                                  });
    }
    return to;
}

While this works, it looks a little too old-school what with the funky indentation and the use of a “previous” placeholder. But try as I might, I can’t think of another solution that would be cleaner or more maintainable. There’s a chance I may be too “close” to the problem though.

Honorary Hillbilly Status awaits the person who provides an answer deemed worthy of the domain upon which this example is based. Answers must assume a cordial tone and work within the context outlined. Ones that do not (hint: These will be the answers that take the form “Why are you doing XXXX in the first place?” or one of its variations) will be summarily dismissed.

Kyle the Advancefully Grateful

This entry was posted in Uncategorized. Bookmark the permalink. Follow any comments here with the RSS feed for this post.
  • http://www.bradleyboveinis.com Brad

    This is how I solved a similar situation in a different domain (representing parent/child relationships)

    First of all, the domain model. Location contains other locations (children)

    public class Location : EntityBase
    {
    #region Properties

    public virtual Location ParentLocation { get; set; }
    public virtual string LocationName { get; set; }
    public virtual string LocationType { get; set; }
    public virtual string ThirdPartyId { get; set; }
    public virtual string ThirdPartySystem { get; set; }
    public virtual long? MapId { get; set; }

    private IList _locations = new List();
    public virtual IEnumerable
    Locations
    {
    get { return _locations; }
    }

    private IList _devices = new List();
    public virtual IEnumerable
    Devices
    {
    get { return _devices; }
    }

    private IList _assets = new List();
    public virtual IEnumerable
    Assets
    {
    get { return _assets; }
    }

    #endregion

    #region Constructor

    public Location()
    {
    // empty constructor so class can be instantiated then built up using methods provided for reflection if needed
    LocationName = string.Empty;
    MapId = null;
    }

    public Location(string locationName)
    {
    LocationName = locationName;
    }

    #endregion

    #region Helper Methods

    ///

    /// Gets all locations under this location.
    ///

    ///
    public virtual List GetAllContainedLocations()
    {
    List
    locs = new List();
    _locations.ToList().ForEach(l => locs.AddRange(l.GetAllContainedLocations()));
    locs.Add(this);
    return locs;
    }

    ///

    /// Gets all devices for this location and sub locations.
    ///

    ///
    public virtual IEnumerable GetAllAssetsUnderLocation()
    {
    var locs = GetAllContainedLocations();
    var t2 = new List
    ();
    locs.ForEach(l => t2.AddRange(l.Assets));
    return t2;
    }

    ///

    /// Gets all devices for this location and and sub locations.
    ///

    ///
    public virtual IEnumerable GetAllDevicesUnderLocation()
    {
    var locs = GetAllContainedLocations();
    var t2 = new List
    ();
    locs.ForEach(l => t2.AddRange(l.Devices));
    return t2;
    }

    ///

    /// Removes the device from this location.
    ///

    /// The device.
    public virtual void RemoveDevice(Device device)
    {
    if (_devices.Contains(device))
    _devices.Remove(device);
    }

    ///

    /// Adds a device to the location.
    ///

    /// The device.
    public virtual void AddDevice(Device device)
    {
    device.SetParentLocation(this);
    _devices.Add(device);
    }

    #endregion

    ///

    /// Validates this instance.
    ///

    ///
    protected override bool Validate()
    {
    return true && Locations.All(x => x.IsValid);
    }

    public virtual void AddLocation(Location newLocation)
    {
    newLocation.SetParentLocation(this);
    _locations.Add(newLocation);
    }

    public virtual void SetParentLocation(Location location)
    {
    ClearParentLocation();
    ParentLocation = location;
    }

    public virtual void ClearParentLocation()
    {
    if(ParentLocation != null)
    ParentLocation.RemoveLocation(this);
    ParentLocation = null;
    }

    public virtual void RemoveLocation(Location location)
    {
    if (_locations.Contains(location))
    _locations.Remove(location);
    }

    public virtual void AddAsset(Asset asset)
    {
    asset.SetParentLocation(this);
    _assets.Add(asset);
    }

    public virtual void RemoveAsset(Asset asset)
    {
    if (_assets.Contains(asset))
    _assets.Remove(asset);
    }

    }

    Then the DTO

    public class LocationCreatedEventMessage : EventMessageBase
    {
    public virtual Guid LocationId { get; set; }
    public virtual string LocationName { get; set; }
    public virtual Guid ParentLocationId { get; set; }
    public virtual long MapId { get; set; }
    public virtual Guid[] LocationIds { get; set; }
    public virtual Guid[] DeviceIds { get; set; }
    public virtual Guid[] AssetIds { get; set; }
    public virtual string LocationType { get; set; }
    public virtual string ThirdPartyId { get; set; }
    public virtual string ThirdPartySystem { get; set; }
    }

    Flattening it out into our DTO:

    LocationCreatedEventMessage newMessage = new LocationCreatedEventMessage
    {
    CausalMessageId = message.Id,
    DeviceIds = location.Devices.Select(x => x.Id).ToArray(),
    AssetIds = location.Assets.Select(x => x.Id).ToArray(),
    Id = Guid.NewGuid(),
    LocationId = location.Id,
    LocationIds = location.Locations.Select(x => x.Id).ToArray(),
    LocationName = location.LocationName,
    LocationType = location.LocationType,
    MapId = location.MapId.HasValue ? location.MapId.Value : -1,
    ParentLocationId = location.ParentLocation == null ? Guid.Empty : location.ParentLocation.Id, //The version of NServiceBus we use currently doesn’t support nullable types
    TimeStampUtc = DateTime.Now.ToUniversalTime(),
    ThirdPartyId = location.ThirdPartyId,
    ThirdPartySystem = location.ThirdPartySystem
    };

    The null object pattern (also used with device and assets)

    public class MissingLocation : Location
    {
    public override string LocationName
    {
    get
    {
    return “Missing Location Information”;
    }
    set
    {
    }
    }

    protected override bool Validate()
    {
    return false;
    }
    }

    Finally, the message handler on the client side (we use nservicebus).

    public class LocationMessageHandlers
    : IMessageHandler

    {
    #region Implementation of IMessageHandler

    ///

    /// Handles a message.
    ///

    /// The message to handle.
    ///
    /// This method will be called when a message arrives on the bus and should contain
    /// the custom logic to execute when the message is received.
    ///

    public void Handle(LocationCreatedEventMessage message)
    {
    #region Create the location object

    Location location = GetLocation(message.LocationId);

    location.Id = message.LocationId;
    location.LocationName = message.LocationName;
    location.MapId = (message.MapId < 0) ? new long?() : message.MapId;
    location.LocationType = message.LocationType;
    location.ThirdPartyId = message.ThirdPartyId;
    location.ThirdPartySystem = message.ThirdPartySystem;

    if (message.ParentLocationId != Guid.Empty)
    GetAssociatedLocation(message.ParentLocationId).AddLocation(location);

    foreach (var id in message.DeviceIds)
    {
    var device = DeviceCache.Instance.FindByKey(id) ?? new MissingDevice() { Id = id };
    location.AddDevice(device);
    }

    foreach (var id in message.AssetIds)
    {
    var asset = AssetCache.Instance.FindByKey(id) ?? new MissingAsset() { Id = id };
    location.AddAsset(asset);
    }

    foreach (var id in message.LocationIds)
    {
    var loc = GetAssociatedLocation(id);
    location.AddLocation(loc);
    }

    #endregion
    }

    private Location GetLocation(Guid guid)
    {
    var location = LocationCache.Instance.FindByKey(guid) ?? CreateLocation(guid);
    if (location is MissingLocation)
    {
    LocationCache.Instance.Remove(location.Id);
    if (location.ParentLocation != null)
    location.ClearParentLocation();
    location = CreateLocation(guid);

    }
    return location;
    }

    private Location CreateLocation(Guid guid)
    {
    var loc = new Location { Id = guid };
    LocationCache.Instance.Add(guid, loc);
    return loc;
    }

    private Location GetAssociatedLocation(Guid guid)
    {
    return LocationCache.Instance.FindByKey(guid) ?? CreateMissingLocation(guid);
    }

    private MissingLocation CreateMissingLocation(Guid guid)
    {
    var loc = new MissingLocation { Id = guid };
    LocationCache.Instance.Add(guid, loc);
    return loc;
    }

    #endregion
    }

  • Robert Slaney

    @Pedro
    I ask all my developers to understand how List, even XML types, are implemented under the hood… expecially all the LINQ method. What a fanstatic tool Reflector is.

    Take for example the old XmlChildNodes class ( the one you get when you ask for .Children on a node ). Did you know that when you ask for n-th node it starts at the first child and repeated calls .NextSibling until it counts to “n”.

    — Snippet from System.Xml.XmlChildNodes class —
    public override XmlNode Item(int i)
    {
    if (i >= 0)
    {
    XmlNode firstChild = this.container.FirstChild;
    while (firstChild != null)
    {
    if (i == 0)
    {
    return firstChild;
    }
    firstChild = firstChild.NextSibling;
    i–;
    }
    }
    return null;
    }

    Some of our devs don’t like to use the foreach construct if you have an indexer available because they think it is ALWAYS quicker.

    for ( int i = 0; i < nodes.Count; i++ )
    {
    XmlNode node = nodes[i];

    }

    In this example if nodes.Count = 5, then you will internally iterate though the node list’s for loop 15 times, 10 children elements means 55 times.

  • http://www.digitaldias.com Pedro G. Dias

    >> it looks a little too old-school what with the funky indentation and the use of a “previous” placeholder

    That old and fonky look is the magic underneath the implementation of List and any other Collection algorithm you can think of. It’s not gone, it’s just hidden, so that you won’t have to bother to learn how to :p

  • http://weblogs.asp.net/bleroy Bertrand Le Roy

    Yeeeehaaaaaaa!

  • http://codebetter.com/members/kylebaley/default.aspx Kyle Baley

    Hmmm…Should be careful when I ask people to do my work for me. So many good suggestions, it ends up being more work than if I just Googled it.

    Nice to see people having fun with it. Winner to be announced at a later date. And while Bertrand doesn’t get full Hillbilly status for not working within the parameters, I’ll award him a Golden Duellin’ Banjo for introducing me to (or possibly reminding me of; university was a long time ago) interval trees. I’ll send you an invoice for the time I spent reading about them when I should have been billable.

  • http://www.colourcoding.net/blog Julian Birch

    Maybe I’m missing something here:

    return (from p in flatList
    group new AllegedChild() { Id = p.ChildId, FullName = p.ChildName }
    by new { p.ParentId, p.Name } into g
    select new Parent() {
    AllegedChildren = g.ToList(),
    Id = g.Key.ParentId,
    FullName = g.Key.Name
    }).ToList();

    It doesn’t handle grandparents, but nor does the original code, as for as I can tell.

    Incidentally Robert has hit the nail on the head of observing that the need to pull in the parent’s name makes the whole thing more complex.

  • Adiomarin

    Something like this:

    public IList GetParentsRefTwo(IEnumerable flatSource)
    {
    var qry = (from dto in flatSource
    let children = flatSource
    .Where(c => c.ParentId == dto.ParentId)
    .Select(c => new AllegedChild {ChildId = c.ChildId, ChildName = c.ChildName})
    orderby dto.ParentId
    select new Parent
    {
    ParentId = dto.ParentId,
    ParentName = dto.ParentName,
    AllegedChildren = children.ToList()
    }
    ).Distinct(new ParentComparer());
    return qry.ToList();
    }

  • Robert Slaney

    private List GitEmAgain(IEnumerable flatList)
    {
    return flatList
    .GroupBy(
    dto => new Parent() { Id = dto.ParentId, FullName = dto.Name },
    dto => new AllegedChild() { Id = dto.ChildId, FullName = dto.ChildName },
    (parent, children) =>
    {
    parent.AllegedChildren = children.ToList();
    return parent;
    }
    , new ParentComparer()
    ).ToList();
    }

    public class ParentComparer : EqualityComparer {
    public override bool Equals( Parent x, Parent y )
    {
    if ( x == null ^ y == null )
    return false;
    return x.Id == y.Id && x.FullName == y.FullName;
    }

    public override int GetHashCode( Parent obj )
    {
    return obj.Id;
    }
    }

  • http://judahgabriel.blogspot.com Judah Himango

    Are you asking how to generate an object graph from a sequence of ParentChild objects? If so, I had asked this question on StackOverflow.com and got some good answers, along with one nice generic function:

    http://stackoverflow.com/questions/947831/c-algorithm-for-generating-hierarchy

  • Marty Daryl Wilbur

    That there GroupBy is jus’ wut ya need, man! Ol’ Jimbo ain’t steerin’ ya too wrong there.

    If I gotta tree-bin me some list data ‘n Ima stuck without our good buddy LINQ, or f’r what ever reason I wanna do yer kinda iteration there, I use me a co-lection class o’ some sort ‘n let it worry ’bout membership. So may be we have:

    var family = new Dictionary();
    foreach( var dto in flatList )
    {
    if (! family.ContainsKey(dto.ParentID))
    {
    family.Add(dto.ParentID, new Parent { … });
    family[dto.ParentID].AllegedChildren = new List();
    }

    if (dto.ChildID != null)
    {
    family[dto.ParentID].AllegedChildren.Add( new AllegedChild { … } );
    }
    }

    // Damn yer eyes, Dictionary.Values!
    Parent[] parents = new Parent[family.Count];
    family.Values.CopyTo(parents, 0);
    return new List (parents);

    I wouldn’t go’n call it pretty, but it done got ridda all that prev’ius noise while keepin’ the spirit o’ yer approach.
    If’n LINQ be “new-school”, then this is old-school++. In-spired by perl, in fact.

  • http://weblogs.asp.net/bleroy Bertrand Le Roy

    If it is OK to change the database shape, you might want to give a look at interval trees, which are fast to read, slower to write. But I’m suspecting this may be the kind of answer that you are not looking for. Still, if you don’t know them, check them out, they are a nice data structure for hierarchical data that is too often overlooked.

  • http://twitter.com/randolphcabral Randolph Cabral

    This was fun! Here’s my lazy-lambda way:

    private List GitEm(IEnumerable flatList)
    {
    var parentlist = new List ();

    flatList.Select(p => new Parent { Id = p.ParentId, FullName = p.ParentName })
    .Distinct(new ParentComparer()).ToList()
    .ForEach(p=>
    {
    p.AllegedChildren = new List();
    p.AllegedChildren.AddRange(flatList
    .Where(c => c.ParentId == p.Id)
    .Select(c => new AllegedChild { Id = c.ChildId, FullName = c.ChildName}));
    parentlist.Add(p);
    });

    return parentlist;
    }

  • Jim Bolla

    The LINQ .GroupBy() method is what you want…

    var normalized = flat.GroupBy(
    p => new { p.ParentId, p.ParentName },
    p => new { p.ChildId, p.ChildName },
    (p, cs) => new Parent { Id = p.ParentId, FullName = p.ParentName, Children = cs.Select(c => new AllegedChild { Id = c.ChildId, FullName = c.ChildName}).ToList() });

    or something thereabouts.

  • http://devlicio.us/blogs/tuna_toksoz Tuna Toksoz

    Hmm, NHibernate algorithms would help here, maybe. Have to look.

  • http://www.jasondentler.com/blog Jason Dentler

    I’m thinking something with LINQ’s Distinct would work… (excuse the VB)

    Dim ParentsQuery = From dto In flatList Select P = New Parent() With {.Id = dto.ParentId, .Name = dto.ParentName} Distinct
    Select P, Children = (From dto2 In flatList Where dto2.ParentId = P.Id Select New Child() With {.Id = dto.ChildId, .Name = dto2.ChildName, .Parent = P}

    For Each item In ParentsQuery
    item.p.children = item.Children
    Next

    You could also do something similar with a Hashset. I don’t know about performance for either one, but I think the code is definitely easier to understand.