Episerver Content Migration With Json.Net

The migration of content (also known as “lift & shift”) is the process of copying content from an existing platform into the new website.

Content migration is one of the most overlooked aspects of a lot of projects. When companies switch CMS platforms, change technology stacks or even upgrade between versions of the same CMS, in a lot of situations you will need to get the data from the old system into the new one. I hate to burst your bubble, but this is hardly ever easy.

Manual or Automatic

My first tip surrounding content migration is to think carefully about whether the time, cost and lost development effort will be worth it in your situation. Manually migrating content can take weeks, and often months, depending on what you need to migrate.

Hiring some content editors to move the content manually will often be a lot more cost-effective than getting a developer to spec out and implement a migration script.

Using an off-the-shelf content migration product can be expensive and, in the majority of situations, not even possible. The other issue with an automatic solution is that it ties up your best people writing the code. While your developers burn precious project man-hours on migration, they are not working on anything else. In my experience, automated content migration typically takes longer than expected and can be tedious.

There are definitely a lot of situations where the automatic route makes sense. Maybe you need to run the scripts more than once, or maybe you don’t have access to content editors. Whatever the reason, in today’s guide we are going to cover importing content via Json.

Scheduled Tasks

In Episerver it makes a lot of sense to put your content migration scripts inside a scheduled task. Why?

  • The code may only be run once or twice
  • Scheduled jobs are only available in admin mode, so content editors can’t access them
  • There are not a lot of other options 🙂

I’ve written previously in How To Set Up A Scheduled Task about the basics of a scheduled task.

The first thing we need is our Json file, which looks like this:

{"name": "Page 6", "seotitle": "Seo Title 6", "keywords": "Keywords 6"}
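
Each file represents a single page to import. The scheduled task we will write below picks up every *.json file from an Import folder in the webroot, so with a handful of pages to migrate the folder might look like this (the file names themselves don’t matter):

\Import\page1.json
\Import\page2.json
\Import\page3.json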

In this guide, we will only import the basics: a page name and a couple of SEO properties, the SEO title and the page keywords. The data will be imported into a Content Page:

[ContentType(
    DisplayName = "Content Page",
    GUID = "8dee011f-8dbf-43ab-b4f3-211db5ceb9d5",
    Description = "Content Page",
    GroupName = "Standard")]
public class ContentPage : GlobalBasePage
{
    [Display(
        Name = "Page Title",
        Description = "Page Title",
        GroupName = SystemTabNames.Content,
        Order = 100)]
    [CultureSpecific]
    public virtual string PageTitle { get; set; }

    [Display(
        Name = "Main Content Area",
        Description = "Region where content blocks can be placed",
        GroupName = SystemTabNames.Content,
        Order = 200)]
    public virtual ContentArea MainContentArea { get; set; }
}
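
ContentPage inherits from GlobalBasePage, a shared base class that isn’t shown in this post. The only thing that matters for this guide is that it exposes the SeoTitle and Keywords properties the importer sets later on. A minimal sketch, assuming it derives straight from PageData, might look like this:

public class GlobalBasePage : PageData
{
    // The property names must match what ContentPageRepository sets below;
    // the Display metadata here is only illustrative.
    [Display(
        Name = "Seo Title",
        GroupName = SystemTabNames.Content,
        Order = 300)]
    [CultureSpecific]
    public virtual string SeoTitle { get; set; }

    [Display(
        Name = "Keywords",
        GroupName = SystemTabNames.Content,
        Order = 400)]
    [CultureSpecific]
    public virtual string Keywords { get; set; }
}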

The Scheduled Task

[ScheduledPlugIn(DisplayName = "Content Page Importer",
    SortIndex = 100)]
public class ContentPageImporter : JobBase
{
    private static readonly ILog Logger = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);

    private int processedNodes;
    private int failedNodes;

    public ContentPageImporter()
    {
        processedNodes = 0;
        failedNodes = 0;
    }

    private long Duration { get; set; }

    public override string Execute()
    {
        var timer = Stopwatch.StartNew();
        var epiServerDependencies = ServiceLocator.Current.GetInstance<IEpiserverDependencies>();
        var fileDirectory = FileHelper.GetImportDirectoryPath();
        var jsonFiles = FileHelper.GetFiles(fileDirectory);

        if (jsonFiles.Any())
        {
            ProcessFiles(epiServerDependencies, jsonFiles);
        }
        else
        {
            return "No files to process";
        }

        timer.Stop();
        Duration = timer.ElapsedMilliseconds;
        return ToString();
    }

    private void ProcessFiles(IEpiserverDependencies epiServerDependencies, List<string> jsonFiles)
    {
        foreach (var jsonFile in jsonFiles)
        {
            using (var streamReader = new StreamReader(jsonFile))
            {
                var json = streamReader.ReadToEnd();
                ContentPageData contentPageData;
                var settings = new JsonSerializerSettings
                {
                    NullValueHandling = NullValueHandling.Ignore
                };

                try
                {
                    contentPageData = JsonConvert.DeserializeObject<ContentPageData>(json, settings);
                }
                catch (JsonSerializationException ex)
                {
                    Logger.Error(string.Format("Invalid Json file {0}", jsonFile), ex);
                    failedNodes = failedNodes + 1;
                    continue;
                }
                catch (JsonReaderException ex)
                {
                    Logger.Error(string.Format("Invalid Json format within {0}", jsonFile), ex);
                    failedNodes = failedNodes + 1;
                    continue;
                }

                contentPageData.ParentContentReference = ContentReference.RootPage;
                var contentPageRepository = new ContentPageRepository(epiServerDependencies);
                var contentPageReference = contentPageRepository.CreateContentPage(contentPageData);

                if (contentPageReference == null)
                {
                    Logger.ErrorFormat("Unable to create content page {0}", contentPageData.PageName);
                    failedNodes = failedNodes + 1;
                    continue;
                }

                processedNodes = processedNodes + 1;
            }
        }
    }

    public override string ToString()
    {
        return string.Format(
            "Imported {0} pages successfully in {1}ms. {2} page(s) failed to import.",
            processedNodes,
            Duration,
            failedNodes);
    }
}

First, we create a new class that inherits from JobBase and decorate it with the standard [ScheduledPlugIn] attribute. We then define a few members that allow us to report back to the end user how many files were imported, how many failed and how long the job took.
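
One thing worth flagging: IEpiserverDependencies is not part of the Episerver API. It’s a small wrapper interface of my own that groups the Episerver services the import code needs, which keeps the job and the repository easy to test. The full definition ships with the code sample at the end of this post, but a minimal version, registered with Episerver’s IoC container, could look something like this:

public interface IEpiserverDependencies
{
    IContentRepository ContentRepository { get; }
}

[ServiceConfiguration(typeof(IEpiserverDependencies))]
public class EpiserverDependencies : IEpiserverDependencies
{
    // IContentRepository is constructor-injected by the container
    public EpiserverDependencies(IContentRepository contentRepository)
    {
        ContentRepository = contentRepository;
    }

    public IContentRepository ContentRepository { get; private set; }
}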

As in any Episerver scheduled job, the real meat happens in the Execute method. In here we use the FileHelper class to read in all the Json files. FileHelper looks like this:

public static class FileHelper
{
    public static List<string> GetFiles(string directory)
    {
        var files = new List<string>();
        files.AddRange(Directory.GetFiles(directory, "*.json", SearchOption.TopDirectoryOnly));
        return files;
    }

    public static string GetImportDirectoryPath()
    {
        var webRoot = new DirectoryInfo(HostingEnvironment.ApplicationPhysicalPath);
        return string.Format("{0}\\Import\\", webRoot);
    }
}

After we get all the files, we then use Json.Net to populate a custom object I’ve created called ‘ContentPageData’:

public class ContentPageData
{
    [JsonProperty(PropertyName = "name")]
    public string PageName { get; set; }

    public ContentReference ParentContentReference { get; set; }

    [JsonProperty(PropertyName = "seotitle")]
    public string SeoTitle { get; set; }

    [JsonProperty(PropertyName = "keywords")]
    public string Keywords { get; set; }
}

This object will be automatically populated via Json.Net and passed into a create page method. If you have never come across Json.Net before, I’ve written an introduction to it in this guide: Importing Data Using Json.Net.
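
As a quick sanity check, here is how the sample file from the start of this post maps onto the object. This is a throwaway snippet, not part of the import job itself:

var json = @"{""name"": ""Page 6"", ""seotitle"": ""Seo Title 6"", ""keywords"": ""Keywords 6""}";
var contentPageData = JsonConvert.DeserializeObject<ContentPageData>(json);
// contentPageData.PageName -> "Page 6"
// contentPageData.SeoTitle -> "Seo Title 6"
// contentPageData.Keywords -> "Keywords 6"
// ParentContentReference stays null; we set it ourselves after deserialization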

The code to do all this can be seen here:

var json = streamReader.ReadToEnd();
ContentPageData contentPageData;
var settings = new JsonSerializerSettings
{
    NullValueHandling = NullValueHandling.Ignore
};

try
{
    contentPageData = JsonConvert.DeserializeObject<ContentPageData>(json, settings);
}
catch (JsonSerializationException ex)
{
    Logger.Error(string.Format("Invalid Json file {0}", jsonFile), ex);
    failedNodes = failedNodes + 1;
    continue;
}
catch (JsonReaderException ex)
{
    Logger.Error(string.Format("Invalid Json format within {0}", jsonFile), ex);
    failedNodes = failedNodes + 1;
    continue;
}

After our custom object has been populated with all the data from the Json file, we set any extra properties, like the parent page reference. I then pass the object into the content repository to create a new Episerver page:

contentPageData.ParentContentReference = ContentReference.RootPage;
var contentPageRepository = new ContentPageRepository(epiServerDependencies);
var contentPageReference = contentPageRepository.CreateContentPage(contentPageData);

if (contentPageReference == null)
{
    Logger.ErrorFormat("Unable to create content page {0}", contentPageData.PageName);
    failedNodes = failedNodes + 1;
    continue;
}

processedNodes = processedNodes + 1;

This calls the CreateContentPage method on the ContentPageRepository, which looks like the code below. Notice that it first checks whether a page with the same name already exists under the parent, so the job can safely be run more than once without creating duplicates:

public class ContentPageRepository
{
    private readonly IEpiserverDependencies _epiServerDependencies;

    public ContentPageRepository(IEpiserverDependencies epiServerDependencies)
    {
        _epiServerDependencies = epiServerDependencies;
    }

    public ContentPage CreateContentPage(ContentPageData contentPageData)
    {
        var existingPage = _epiServerDependencies
            .ContentRepository
            .GetChildren<ContentPage>(ContentReference.RootPage)
            .FirstOrDefault(x => x.PageTitle == contentPageData.PageName);

        if (existingPage != null)
            return existingPage;

        var newPage = _epiServerDependencies.ContentRepository
            .GetDefault<ContentPage>(contentPageData.ParentContentReference);
        newPage.PageTitle = contentPageData.PageName;
        newPage.Name = contentPageData.PageName;
        newPage.SeoTitle = contentPageData.SeoTitle;
        newPage.Keywords = contentPageData.Keywords;

        return Save(newPage) != null ? newPage : null;
    }

    public ContentReference Save(ContentPage contentPage,
        SaveAction saveAction = SaveAction.Publish,
        AccessLevel accessLevel = AccessLevel.NoAccess)
    {
        if (contentPage == null)
            return null;

        return _epiServerDependencies.ContentRepository
            .Save(contentPage, saveAction, accessLevel);
    }
}

Conclusion

In today’s guide, we’ve talked about the pros and cons of going down the automatic content migration route. We’ve created a custom scheduled task that reads Json files in from an Import directory located in our webroot and then converts the data into a Content Page using Json.Net.

Code Sample

As always, a fully working code sample can be downloaded from my Github page: JonDJones.com.EpiserverContentMigrationWithJson.
